




SIMULATION AND DESIGN OF STORAGE AREA 
NETWORK 















A THESIS SUBMITTED  
FOR THE DEGREE OF MASTER OF ENGINEERING  
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING 





Foremost, I would like to thank my parents, who have always supported me during
my master's study at NUS. I also thank my parents for teaching me to have a sense of
balance in my life, which I will treasure for the rest of my life.
 
No words can express my appreciation to my advisors, Chong Tow Chong and Zhu
Yao Long, who trained me from a theory student into a system researcher. During my
work at the Data Storage Institute, A-Star, they put a tremendous amount of effort
into guiding me through each step of my M.Eng program, from writing code to writing
papers, from working on a well-defined project to searching for problems.
 
I wish to thank my SANSim team members, Wang Chao Yang, Xi Wei Ya, and Soh 








Acknowledgements
Summary
List of Figures
List of Tables
1 Introduction
1.1 Storage Area Network
1.1.1 Direct Attached Storage
1.1.2 Network Attached Storage
1.1.3 Storage Area Network
1.1.4 Why SAN?
1.2 Why Simulation of SAN?
1.3 Problem Statement
1.4 Thesis Contributions
1.5 Thesis Organization
2 Background
2.1 FC SAN and Switch Fabric
2.2 Standards of Fibre Channel Fabrics
2.3 Fibre Channel Topologies
2.4 Fibre Channel Switch
2.4.1 Switch Path Selector
2.4.2 Switch Router
2.4.3 Fabric Shortest Path First (FSPF)
2.5 Redundant Array of Independent Disks (RAID) Technologies
2.6 Current SAN Model and Simulation
2.7 SANSim
2.8 Summary
3 Simulation Methodology
3.1 Event Driven Simulation Environment
3.2 High Resolution Timestamp Mechanism
3.2.1 System Software Clock
3.2.2 Diagnostic Counter
3.2.3 Timestamp Conversion
3.3 Open System/Close System Models
3.3.1 Open Subsystem Model
3.3.2 Close Subsystem Model
3.4 Queue Models in Storage System
3.5 Summary
4 Development of SANSim
4.1 SANSim Overview
4.1.1 Memory Allocation in SANSim
4.1.2 Events in SANSim
4.1.3 Modeling of System Components
4.1.4 Integration of Network Module to the “Bus” Structure
4.2 I/O Workload Module
4.2.1 Request Parameters and Workload Patterns
4.2.2 Parameter File
4.2.3 Implementation of the Workload Generation Module
4.3 SANSim Host Module
4.3.1 The Upper Layer
4.3.2 I/O Driver Module
4.3.3 Bus Module
4.3.4 Controller and Adapter
4.4 SANSim Network Module
4.4.1 Network Simulation
4.4.2 FC_controller Module
4.4.3 FC_communication Module
4.5 SANSim Storage Module
4.5.1 RAID Array Module
4.5.2 Functional View of the RAID Array Model
4.5.3 RAID Algorithm Process
4.5.4 Functional Flow of RAID Algorithm Process
4.5.5 Summary of Supporting Schemes
4.6 Capabilities and Limitations of SANSim
4.7 Summary
5 Experiments and Validation
5.1 FC SAN Transmission Specifications
5.2 FC SAN Overhead
5.3 Experimental Environment
5.4 Performance Metrics and Parameters
5.5 SANSim Simulation Configuration
5.6 Comparisons of the Testing Data and Simulation Data
5.7 Summary
6 Application of SANSim
6.1 Core/Edge Topology
6.2 Simulation Environment
6.3 Simulation Results and Analysis
6.3.1 Throughput Analysis
6.3.2 Network Latency Analysis
6.4 Summary
7 Conclusion and Future Work
7.1 Conclusion
7.1.1 SANSim Simulation Tool
7.1.2 Simulation Methodology
7.1.3 Experiments and Validation
7.1.4 SANSim Application








Storage systems researchers need a Storage Area Network (SAN) simulation tool
to develop and verify new ideas and algorithms for Fibre Channel (FC) storage area
networks.
Currently no simulation tool can simulate the whole SAN environment. Most
existing simulation studies and tools are limited to functional simulation and offer
very little modeling and simulation at the FC protocol level.
In this thesis, SANSim, a new FC SAN simulation and design platform, is
presented. SANSim is developed to aid the development and verification of new
ideas and algorithms for FC storage area networks. It supports FC frame-level
simulation, which fully simulates the Fibre Channel protocols in accordance with the
relevant standards to guarantee the compatibility and interoperability of different
modules.
SANSim includes four main modules: the I/O workload module, the host module,
the storage network module, and the storage system module. The I/O workload module
generates I/O request streams according to the workload distribution characteristics
and sends them to the host modules. The host module encapsulates the I/O workload
into Small Computer System Interface (SCSI) commands and sends them to the Host
Bus Adaptor (HBA) sub-modules. The network module simulates the network
connectivity, topology and communication mechanism. The FC network module
includes three sub-modules: the FC_controller module, the FC_switch module and the
FC_communication module. The storage module maps I/O data to the storage devices.
SANSim uses an event driven simulation methodology. The SANSim simulation
modules have been validated by comparing the simulation results with the actual I/O
performance of an FC Random Access Memory (RAM) disk connected to an FC
network. The results show that the SANSim model is accurate, with an error of less
than 3% for read operations and less than 10% for write operations.
SANSim can be used effectively to uncover hidden protocol design
problems in existing systems. The Core/Edge FC implementation is a typical storage
network structure in FC SANs, and how to design and optimize the FC network
is an important issue. The thesis also provides a detailed analysis of
Core/Edge network performance metrics based on SANSim. The simulation results
show that the Core/Edge topology suffers from a certain level of bandwidth loss due to










Figure 1.1: Direct Attached Storage (DAS)
Figure 1.2: Network Attached Storage (NAS)
Figure 1.3: Fibre Channel SAN
Figure 2.1: FC Topologies
Figure 2.2: FC Switch Module
Figure 3.1: Events in the Waiting Pool
Figure 4.1: SANSim Internal Structure
Figure 4.2: The Data Structure of SANSim and the Reserved Memory Area
Figure 4.3: Free Event Pool
Figure 4.4: SANSim Simple Disk Memory Allocation
Figure 4.5: The Role of Component’s Data Structure
Figure 4.6: Bus VS Network Structure
Figure 4.7: Buspath and Slotpath
Figure 4.8: FC Network in SANSim
Figure 4.9: Two Main Steps to Generate Repeatable Requests
Figure 4.10: Generate Block Level Requests for Simulation
Figure 4.11: SANSim Host Module
Figure 4.12: Data Structure in SANSim Host Module
Figure 4.13: I/O Drivers Data Structure
Figure 4.14: Bus Structure
Figure 4.15: Modeling of Fibre Channel Network in SANSim
Figure 4.16: FC Network Modules and Relationship
Figure 4.17: DMA Operation
Figure 4.18: FCAL Data Structure
Figure 4.19: FCAL Operation
Figure 4.20: LPSM State
Figure 4.21: The Edge Event of Continuous Order Sets
Figure 4.22: BB Credit Flow Control in FC Connection Sub-module
Figure 4.23: FC_Switch Modeling
Figure 4.24: Multiple Dimensional RAID Array
Figure 4.25: Functional Block Diagram
Figure 4.26: RAID Array Queuing Module
Figure 4.27: Functional Block Diagram of the RAID Algorithm Process
Figure 5.1: Real System Overhead Collection Configuration
Figure 5.2: Fibre Channel Analyser Trace Format
Figure 5.3: Example Trace of Simulated Port Transmission
Figure 5.4: Simulation Configuration
Figure 5.5: SANSim Simulation Model Configuration
Figure 5.6: IOPS and Throughput for R/W Operations with I/O Request Size
Figure 6.1: Core/Edge Fabric
Figure 6.2: Simulation Configuration
Figure 6.3: Maximal Throughputs under Symmetrical I/O Load for Different Cases
Figure 6.4: Maximum Bandwidth Sustainable for Symmetrical I/O
Figure 6.5: Traffic Delivered to Servers through ISL1-ISL4
Figure 6.6: Traffic Delivered from Storages through ISL5-ISL8
Figure 6.7: I/O Response Time Measured from Servers for Different Device Accesses








Table 1: Summary of SANSim and Other Simulators
Table 2: Supporting Mapping and Redundancy Schemes
Table 3: System Initiator HBA Overhead & Control Constant
Table 4: FCP Target Overhead & Control Constant









Chapter 1   Introduction  
 
1.1  Storage Area Network  
Data storage is very important in today’s information world [5]. Online data 
storage doubles every 9 months [6] due to an ever-growing demand for networked 
information services [7, 8, 9]. In general, storage architectures have evolved from 
Direct Attached Storage (DAS), to Network Attached Storage (NAS) [10, 11, 12, 13] 
and Storage Area Network (SAN) [14, 15, 16]. 
1.1.1 Direct Attached Storage 
Direct Attached Storage (Figure 1.1) is a traditional server storage architecture, 
in which storage devices are part of the host computer. Network workstations must 
therefore access the server in order to connect to the storage device. The storage has to 
be added specifically for a server, even though an adjacent server may still have 
available storage capacity. DAS is difficult to manage, and it does not provide 








Figure 1.2: Network Attached Storage (NAS) 
1.1.2 Network Attached Storage
Network Attached Storage (Figure 1.2) architecture allows a storage
system/device to be directly connected to a standard network, typically via Ethernet.
Clients in the network can access the NAS directly. A NAS based storage subsystem
has a built-in file system to provide clients with file system functionality.
However, NAS has many of the same limitations as DAS. As the first NAS box
reaches its capacity limit, the NAS head becomes a bottleneck and a single point of
failure for a significant amount of storage. The user must then add a second NAS head
and begin to populate that subsystem. In many ways, adding the second NAS head
introduces the same complexity that IT struggles with in a DAS environment: each NAS
head must be managed separately. As the number of NAS heads grows, the problem
gets worse. NAS also suffers from one of the key drawbacks of DAS, inefficient
provisioning. Capacity cannot be re-allocated between NAS subsystems to balance out
the storage usage, leaving some subsystems under-utilized and others near or at
capacity.
1.1.3 Storage Area Network 
Storage Area Network (Figure 1.3) technology provides a simple block level 
interface for storage devices. SAN operates behind the servers to provide a common 
link between servers and storage, allowing administrators to independently scale the 
storage or server processing power as requirements demand. It allows multiple servers 
to access the same data so that duplication of information can be reduced, and permits 
data backup to take place directly over storage channels, eliminating the bottleneck of 
the relatively slow LAN.  Data is also more consistently available, as the failure of a 

















  Figure 1.3: Fibre Channel SAN 
1.1.4 Why SAN? 
The highly scalable nature of a SAN makes the network of servers and storage
devices interconnected with Fibre Channel hubs and switches a hot topic. With the
e-commerce explosion, many companies are feeling the growing pains associated with
data management. Providing sufficient disk space is only part of the equation. In fact,
storage is so cheap that IT administrators give little thought to adding 100 GB here
and there when their servers run low on disk space. But as server farms grow, the
overhead associated with directly attached storage balloons out of control, forcing
administrators to manage data reactively.
In a clustered server environment, it makes sense to consolidate storage. To
attain the high-availability benefits of a cluster, the storage subsystem must itself be
highly available. By sharing storage pools in a SAN, servers sharing data sets can fail
over seamlessly. The high speed Fibre Channel infrastructure between storage pools can
reduce disaster recovery time from several hours to less than a few minutes. SAN can
potentially offer the following benefits:
• Simplified centralized management: A single image of the storage media
simplifies storage management.
• Improvements to application availability: Storage is independent of
applications and accessible through multiple data paths for better reliability,
availability, and serviceability.
• Higher application performance: Storage processing is off-loaded from
servers and moved onto a separate network.
• Centralized and consolidated storage: Simpler management, scalability,
flexibility, and availability.
• Data transfer and vaulting to remote sites: Remote copies of data provide
protection against disasters and malicious attacks.
1.2 Why Simulation of SAN? 
With the ever-increasing amount of data generated in the world, the efficient
storage and management of information has become the focus of much intense
research. Much work has been conducted on storage technology, storage
networking, and storage subsystems. All of this work ultimately aims at better
performance in terms of throughput, latency and bandwidth. Performance analysis
has therefore become a key research topic, needed to predict, assess, evaluate and
explain a system's performance characteristics. Several approaches are commonly
used for the performance analysis of storage systems.
One approach is analytical modeling, which tries to predict performance
variables as a function of the parameters of the workload, storage components, and
system configuration by writing mathematical equations. Analytical modeling can
provide insight into the steady state performance of the system, although it usually
provides only an upper bound on the possible performance. This approach usually needs
queuing theory and Markovian analysis, which require extensive knowledge of
probability theory. In addition, analytical modeling requires skill at approximating the
storage system with simplified mathematical models. Such approximations are
error-prone and difficult to check.
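As a minimal illustration of the analytical approach, the sketch below evaluates the textbook steady-state formulas of an M/M/1 queue for a single storage device. The choice of the M/M/1 model, the function name mm1_metrics and the example rates are assumptions made here for illustration only; they are not taken from the thesis.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue (rates in requests per second).

    Returns (utilization, mean response time in seconds,
    mean number of requests in the system).
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    utilization = arrival_rate / service_rate
    # Mean response time (waiting plus service) of an M/M/1 queue.
    mean_response_time = 1.0 / (service_rate - arrival_rate)
    # Mean number of requests in the system, consistent with Little's law.
    mean_in_system = utilization / (1.0 - utilization)
    return utilization, mean_response_time, mean_in_system


# Hypothetical device serving 100 requests/s under a load of 80 requests/s.
rho, response_time, in_system = mm1_metrics(arrival_rate=80.0, service_rate=100.0)
```

Even this small model shows why such approximations are hard to check: it assumes exponential inter-arrival and service times, which a real storage workload need not follow.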
Another approach is to perform testing and collect measurements from a running
system. By analyzing the relationship between the workload characteristics, the storage
system components and the resulting performance characteristics, a researcher or
developer can identify problems and gain the insight needed to make decisions on the
purchasing and configuration of a storage system. However, this can only be done after
the system is actually available.
A third approach is simulation, in which a computer program implements a
simple model of the behavior of the components of the storage system, and a
synthetic or actual workload is applied to the simulation program so that the
performance of the simulated components and system can be measured. Simulation
can provide a view of the system behavior at any level of detail, provided enough
modeling manpower is available. Trace driven simulation is an approach that controls
a simulation model by feeding in a trace, a sequence of specific events at specific time
intervals. The trace is typically obtained by collecting measurements from an actual
running system. The main disadvantage of trace driven simulation is that errors can
enter because the interfaces between different layers of abstraction are not modeled.
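The idea of driving a simulation model with a trace can be sketched as follows. This is a deliberately small example, not SANSim code: it assumes a single device that serves requests in FIFO order with a constant service time, and the names simulate_trace, trace and service_time are hypothetical.

```python
import heapq
from collections import deque


def simulate_trace(trace, service_time):
    """Replay (arrival_time, request_id) pairs through one FIFO server.

    Returns a dict mapping request_id to its response time
    (completion time minus arrival time).
    """
    events = []  # min-heap of (time, sequence number, kind, request_id)
    for seq, (arrival, req_id) in enumerate(trace):
        heapq.heappush(events, (arrival, seq, "arrive", req_id))
    seq = len(trace)      # sequence numbers keep heap entries unique
    waiting = deque()     # requests queued for the device
    busy = False
    arrived_at = {}
    response = {}
    while events:
        now, _, kind, req_id = heapq.heappop(events)
        if kind == "arrive":
            arrived_at[req_id] = now
            waiting.append(req_id)
        else:  # "complete"
            response[req_id] = now - arrived_at[req_id]
            busy = False
        if not busy and waiting:
            # Start the next queued request and schedule its completion.
            nxt = waiting.popleft()
            busy = True
            heapq.heappush(events, (now + service_time, seq, "complete", nxt))
            seq += 1
    return response


# Hypothetical three-request trace, times in milliseconds.
result = simulate_trace([(0.0, "a"), (1.0, "b"), (5.0, "c")], service_time=2.0)
```

In this run request "b" arrives while "a" is still in service, so its response time includes queuing delay, whereas "a" and "c" see only the service time.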
Studies of SAN performance issues are often done by
simulation [4, 17], especially when evaluating a new algorithm or a new idea. Simulation
allows independent researchers to verify their new ideas against existing common
implementations without building a real system, which usually requires many people
with various skill sets. The other consideration is cost. Setting up a real
system usually involves great expense. For example, when studying
FC SAN performance, implementing the FC SAN components is very
costly. In contrast, simulation requires only a general purpose computer and a
specially designed program to simulate the behavior of these components.
However, current simulation tools are very limited in modeling and
simulation at the FC protocol level. It is necessary to simulate at the FC frame level in
order to monitor and analyze the details of FC SAN activities. With a frame level FC
simulation tool, independent researchers can verify their new algorithms or ideas
against existing common implementations. Real FC SAN design
problems, e.g. the Core/Edge FC implementation problem, can also be studied. These
are the motivations for the development of SANSim.
1.3 Problem Statement 
I. Storage systems researchers need a SAN simulation tool to develop and verify
new ideas and algorithms for FC storage area networks. Current simulation
tools are very limited in their capability to model and simulate at the FC
protocol level. It is necessary to simulate at the FC frame level in order to
monitor and analyze the details of FC SAN activities. This issue is addressed in
the thesis by SANSim, a storage area network simulator.
II. The capacity planning, design and optimization of the FC network is an
important issue. This issue is also addressed by SANSim.
SANSim provides a workable simulation platform for storage area networks. The
modules of SANSim are validated through experiments. SANSim can be used to study
and design SANs and also to develop new algorithms for SAN configuration,
deployment and usage.
1.4 Thesis Contributions 
This thesis makes four main contributions:
• Simulation methodologies
• A workable simulation platform for storage area networks
• Validation of the SANSim simulation model
• An application of SANSim simulation
This thesis proposes a new SAN simulation tool, SANSim [1, 2, 3], which is
based on system level models and simulates at the FC frame level. The FC simulation
module has been validated by comparing the simulation results with the actual I/O
performance of an FC RAM disk connected to an FC network. The simulated results
match the experimental readings within 3% for reads and 10% for writes. SANSim can
be used effectively to uncover hidden protocol design problems in existing systems.
For example, the thesis provides a detailed analysis of Core/Edge network
performance metrics based on SANSim. The simulation results show that the Core/Edge
topology suffers from a certain level of bandwidth loss due to the Head-of-Line blocking
caused by traffic crossing multi-hop routes involving several switches.
1.5 Thesis Organization 
The remainder of the thesis is organized as follows. Chapter 2 gives a
background review of SAN and of SAN simulation studies and tools. Chapter 3
introduces the SANSim simulation methodology, including the event driven simulation
environment, memory allocation and data structures, and the modeling of system
components. Chapter 4 gives detailed descriptions of the structures of SANSim's four
main modules and their relationships. Chapter 5 presents the validation of the SANSim
I/O workload and FC network modules. Chapter 6 gives one SANSim application:
studying the impact of link failure on the performance of an FC network with a
Core/Edge topology. Chapter 7








Chapter 2   Background  
A Storage Area Network is a high speed network that allows the establishment of
direct connections between storage devices and servers within the distance supported
by FC. A SAN allows applications that move data to perform better; it also enables new
network architectures where multiple hosts access multiple storage devices connected
to the same storage network.
Storage system researchers need SAN simulation tools to develop and verify 
new ideas and algorithms for FC storage area networks. Currently there are no 
simulation tools which can simulate the whole SAN environment. Most of the 
simulation tools are limited to functional simulation, and cannot support modeling and 
simulation at the FC protocol level. 
2.1 FC SAN and Switch Fabric 
Fibre Channel is a new serial interface defined by ANSI (American
National Standards Institute) as an open industry standard, but it has already attained
the dominant position in Storage Area Networks. It is generally characterized by
high speed, long distance, and high scalability for both storage channel users and
network users. It provides a general transport vehicle for Upper Level Protocols (ULP)
such as SCSI. The SCSI mapping is defined in FCP (Fibre Channel Protocol for SCSI)
and has been the most frequently used for storage applications.
A storage area network is a high speed special purpose network (or sub
network) that interconnects different kinds of data storage devices with data servers on
behalf of a larger network of users. Typically, a SAN is part of the overall network of
computing resources for an enterprise.
Fibre Channel is presently the dominant protocol used in SANs. The Fibre
Channel Standard (FCS) defines a high speed data transfer interface that can be used to
connect workstations, mainframes, supercomputers, storage devices and so on.
Fibre Channel ports can be connected as point-to-point links, in a loop or to a
switch. The ports in a point-to-point connection are called N_Ports; if they can work in
a loop they are called NL_Ports. An FC switch, or a network of switches, is called a
fabric. Its ports are called F_Ports.
Information can flow between two ports in both directions simultaneously.
An Exchange is the mechanism for coordinating the exchange of information
between two N_Ports. The data is sent in frames that are at most 2148 bytes long.
Frames have a header and a checksum. A set of related frames for one operation is
called a Sequence. For flow control, the Fibre Channel standard uses a look-ahead,
sliding-window scheme that also provides a guaranteed delivery capability. FCS has
the ability to carry multiple existing protocols, including IP and SCSI.
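To make the frame size limit concrete, the sketch below estimates how many frames a Sequence needs to carry a given payload. The 2148-byte maximum comes from the text above; the per-frame field sizes (4-byte start-of-frame and end-of-frame delimiters, a 24-byte header and a 4-byte CRC, leaving a 2112-byte data field) are the commonly cited FC framing values and are stated here as an assumption rather than taken from this thesis. The function name is hypothetical, and optional headers, inter-frame fill words and line encoding are ignored.

```python
MAX_FRAME_BYTES = 2148   # maximum frame length given in the text
SOF_BYTES = 4            # start-of-frame delimiter (assumed field size)
HEADER_BYTES = 24        # frame header (assumed field size)
CRC_BYTES = 4            # checksum (assumed field size)
EOF_BYTES = 4            # end-of-frame delimiter (assumed field size)

OVERHEAD_BYTES = SOF_BYTES + HEADER_BYTES + CRC_BYTES + EOF_BYTES
MAX_PAYLOAD_BYTES = MAX_FRAME_BYTES - OVERHEAD_BYTES


def frames_for_payload(payload_bytes):
    """Return (frame count, total bytes on the link) for one payload."""
    if payload_bytes <= 0:
        raise ValueError("payload must be positive")
    # Ceiling division: the last frame may be only partly filled.
    frame_count = -(-payload_bytes // MAX_PAYLOAD_BYTES)
    total_bytes = payload_bytes + frame_count * OVERHEAD_BYTES
    return frame_count, total_bytes


# An 8 KiB I/O request is split over several frames.
count, total = frames_for_payload(8192)
```

Under these assumptions an 8 KiB transfer needs four frames, so the framing overhead is small relative to the payload; a frame-level simulator has to account for each of those frames individually.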
Fibre Channel networks establish a dedicated communication channel to
transfer information. This channel may span multiple switches using several
inter-switch links (ISLs). The duration of a connection varies with the type of data
transfer in progress. Fibre Channel has a standard routing protocol called Fabric
Shortest Path First (FSPF). Using the FSPF protocol, the fabric decides which ISLs will
be used to establish the communication channel between the two nodes. FSPF requires
that the path computation algorithm choose the minimum cost path between two
switches. The communication channel is then established over this path.
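The minimum cost path requirement can be illustrated with a short sketch based on Dijkstra's algorithm. This shows only the path computation over a known set of link costs; it is not the FSPF protocol itself, which also covers how link state information is exchanged between switches. The function name, the switch names and the link costs are hypothetical.

```python
import heapq


def min_cost_path(links, source, destination):
    """Find the minimum cost path between two switches.

    links maps (switch_a, switch_b) to the cost of a bidirectional ISL.
    Returns (total cost, list of switches), or None if unreachable.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {source: 0}
    previous = {}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for neighbor, link_cost in graph.get(node, []):
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if destination not in best:
        return None
    # Walk back from the destination to rebuild the path.
    path = [destination]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[destination], path


# Hypothetical fabric with two edge switches and two core switches.
isl_costs = {
    ("edge1", "core1"): 500,
    ("edge1", "core2"): 1000,
    ("core1", "edge2"): 500,
    ("core2", "edge2"): 250,
}
route = min_cost_path(isl_costs, "edge1", "edge2")
```

With these costs the route through core1 (total 1000) is preferred over the route through core2 (total 1250), even though the second hop via core2 is cheaper.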
2.2 Standards of Fibre Channel Fabrics 
The Fibre Channel industry has made great efforts to publish standards to aid 
the industry in reaching heterogeneous interoperability. Technical Committee T11 [27] 
defines FC standards which cover many aspects of Fibre Channel. 
The two main protocol standard documents are FC-SW (Fibre Channel Switch) 
and FC-SW-2 [26].  
Basic fabric operations which are defined in the FC-SW standard include: 
• E_Port Operation 
• Principal Switch Selection 
• Address Assignment 
• Self Configuration of E_Port, F_Port, and/or FL_Ports. 
Enhanced fabric operations which are defined in FC-SW-2 include: 
• Exchange Switch Capabilities - Exchange of Switch Operational 
Capabilities 
• B_Port - Bridge Port Operations 
• Path Selection 
• Distributed Server Communication 
• Exchange of Zone Information 
• Distribution of Registered State Change Notifications (RSCNs) 
2.3 Fibre Channel Topologies 
 
A Fibre Channel network can be configured in three different topologies, as 
shown in Figure 2.1. In a Point-to-Point connection, two N_Port devices are connected 
together for direct communication. An Arbitrated Loop is a ring topology in which multiple 
NL_Port devices are daisy chained together to create a complete loop. An Arbitrated 
Loop may be connected to a fabric using an FL_Port. A Fabric topology is a collection 












(c) Switched Fabric 
 
Figure 2.1: FC Topologies 
 
2.4 Fibre Channel Switch 
 




Figure 2.2: FC Switch Module 
 
2.4.1 Switch Path Selector 
The switch path selector performs path selection based on a metric weighting 
computation algorithm. It uses the fabric's link state database as its source of 
information. The link state database is built from FSPF protocol structures: Link 
State Updates (LSU), Link State Records (LSR) and Link Descriptors (LD). The link 
state database also includes a collection of status reports from all the switches in the 
fabric. When a physical change in the fabric occurs, a new link state database is 
created and distributed throughout the fabric. Each switch uses this database and the 
path selection algorithm to define the path, more precisely the least cost path, to all 
other switches in the fabric. There is one exception to this rule: if the fabric is 
segmented using FSPF autonomous regions, a switch only discovers paths to switches within and 
adjacent to its own autonomous region. The switch runs the path selection algorithm when 
it is notified of a physical change to the fabric. It is notified through the process of 
receiving a new or updated LSU. 
2.4.2 Switch Router 
The router is responsible for controlling the movement of Fibre Channel frames 
through the fabric. A switch builds its routing table using the information from the 
path selector. Each frame that comes into a switch contains a destination address. If 
the routing table does not contain an entry for that address, the router uses the path 
information to look up the least cost path to that destination address. The router then 
creates an entry for that address that identifies the next switch along the path to the 
destination. The switch can now pass the frame towards the destination address. In most 
switches, the routing table is built dynamically: when a new address is seen, the router 
looks up the path information and creates a new routing table entry for that address. The 
next time that address is used, the routing table already has an entry for it and the router 
does not need to perform a path selection lookup. 
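The on-demand construction of the routing table can be sketched in C as follows. This is only an illustration of the caching idea, not SANSim or switch firmware code: route_table_t, route_lookup(), the fixed table size and the toy path_select_next_hop() function are all hypothetical names introduced for this example.

```c
#include <assert.h>

#define MAX_DOMAINS 8          /* assumed fabric size for the sketch */
#define NO_ROUTE   (-1)

/* Hypothetical per-switch state: cached next hop per destination. */
typedef struct {
    int next_hop[MAX_DOMAINS]; /* NO_ROUTE until the address is first used */
    int lookups;               /* how often the path selector was consulted */
} route_table_t;

/* Stand-in for the path selector: returns the next switch on the
 * least cost path. Here it is just a fixed toy mapping. */
static int path_select_next_hop(int dest)
{
    return (dest + 1) % MAX_DOMAINS;
}

void route_table_init(route_table_t *rt)
{
    for (int i = 0; i < MAX_DOMAINS; i++)
        rt->next_hop[i] = NO_ROUTE;
    rt->lookups = 0;
}

/* Return the next hop for dest, creating the table entry on first use. */
int route_lookup(route_table_t *rt, int dest)
{
    assert(dest >= 0 && dest < MAX_DOMAINS);
    if (rt->next_hop[dest] == NO_ROUTE) {
        rt->next_hop[dest] = path_select_next_hop(dest);
        rt->lookups++;
    }
    return rt->next_hop[dest];
}
```

Calling route_lookup() twice for the same destination consults the path selector only once; the second call is served from the table.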
2.4.3 Fabric Shortest Path First (FSPF) 
FSPF is a link state path selection protocol, first proposed by Brocade 
Communications [51], based on metric weighting path selection algorithms. The 
standard does not define an algorithm that must be used but simply states that the path 
computation algorithm must yield the path that minimizes the cost of the traversed 
links. The protocol has been accepted as the Fibre Channel Switched Fabrics path 
selection protocol and is defined in the FC-SW-2 standard. FSPF has four major 
components: 
• A hello protocol, used to establish connectivity with a neighbor switch, 
to establish the identity of the neighbor switch, and to exchange FSPF 
parameters and capabilities; 
• A replicated topology database, with the protocols and mechanisms to 
keep the databases synchronized across the fabric; 
• A path computation algorithm; 
• A routing table update. 
These four components work together throughout the 
physical topology of the fabric. Each switch collects information about its adjacent fabric 
connections. The information from each switch is then combined in a link state 
database. This database, which defines the physical topology of the fabric, is then 
distributed throughout the fabric. Each switch runs a path selection algorithm using the 
link state database to determine the available paths across the fabric. The switch then 
uses the path information to build its routing tables. 
FSPF discovers paths to switches using Domain IDs. In general, a Domain ID 
can identify a switch or a collection of switches, but for FSPF a Domain ID identifies a 
single switch. 
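Since the standard only requires that the computed path minimize the cost of the traversed links, any least-cost algorithm can serve as the path computation component. The sketch below uses Dijkstra's algorithm as one well-known choice; it is not taken from the standard or from SANSim. The matrix representation of the link state database, the constant NSWITCH and the function name least_cost_paths() are assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>

#define NSWITCH 5   /* assumed number of switches (Domain IDs 0..4) */
#define NO_LINK 0   /* a cost of 0 marks "no ISL" in this sketch */

/* Compute the least cost from src to every switch over a symmetric
 * link cost matrix, and record the first hop out of src for each
 * destination (the value a routing table would store). */
void least_cost_paths(int cost[NSWITCH][NSWITCH], int src,
                      int dist[NSWITCH], int first_hop[NSWITCH])
{
    int done[NSWITCH] = {0};

    assert(src >= 0 && src < NSWITCH);
    for (int i = 0; i < NSWITCH; i++) {
        dist[i] = INT_MAX;
        first_hop[i] = -1;
    }
    dist[src] = 0;
    first_hop[src] = src;

    for (int iter = 0; iter < NSWITCH; iter++) {
        /* Pick the closest reachable switch that is not finalized yet. */
        int u = -1;
        for (int i = 0; i < NSWITCH; i++)
            if (!done[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break;              /* remaining switches are unreachable */
        done[u] = 1;

        /* Relax every ISL leaving u. */
        for (int v = 0; v < NSWITCH; v++) {
            if (cost[u][v] == NO_LINK || done[v])
                continue;
            if (dist[u] + cost[u][v] < dist[v]) {
                dist[v] = dist[u] + cost[u][v];
                first_hop[v] = (u == src) ? v : first_hop[u];
            }
        }
    }
}
```

Unreachable switches keep a distance of INT_MAX and a first hop of -1, which a caller can treat as "no path".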
 
2.5 Redundant Array of Independent Disks (RAID) Technologies 
RAID is a category of disk storage that employs two or more drives in 
combination for fault tolerance and performance.  
There are a number of different RAID levels:  
• Level 0 -- Striped Disk Array without Fault Tolerance: Provides data 
striping (spreading out blocks of each file across multiple disk drives) but no 
redundancy. This improves performance but does not deliver fault tolerance. 
If one drive fails then all data in the array is lost.  
• Level 1 -- Mirroring and Duplexing: Provides disk mirroring. Level 1 
provides twice the read transaction rate of single disks and the same write 
transaction rate as single disks.  
• Level 2 -- Error-Correcting Coding: Not a typical implementation and rarely 
used. Level 2 stripes data at the bit level rather than the block level.  
• Level 3 -- Bit-Interleaved Parity: Provides byte-level striping with a 
dedicated parity disk. Level 3 cannot service simultaneous multiple requests.  
• Level 4 -- Dedicated Parity Drive: A commonly used implementation of 
RAID. Level 4 provides block level striping (like Level 0) with a dedicated 
parity disk. If a data disk fails, the parity data is used to create a replacement 
disk. A disadvantage to Level 4 is that the dedicated parity disk can create 
write bottlenecks for some applications. 
• Level 5 -- Block Interleaved Distributed Parity: Provides data striping at the 
block level and also stripes parity information across the disks. This results in excellent 
performance and good fault tolerance. Level 5 is one of the most popular 
implementations of RAID.  
• Level 6 -- Independent Data Disks with Double Parity: Provides block level 
striping with parity data distributed across all disks.  
• Level 0+1 -- A Mirror of Stripes: Not one of the original RAID levels. Two 
RAID 0 stripes are created, and a RAID 1 mirror is created over them. Used 
for both replicating and sharing data among disks.  
• Level 10 -- A Stripe of Mirrors: Not one of the original RAID levels. 
Multiple RAID 1 mirrors are created, and a RAID 0 stripe is created over 
these. 
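The parity used by the single-parity levels above (3, 4 and 5) can be illustrated with a short C sketch. It assumes the usual XOR parity: the parity block is the bytewise XOR of the data blocks in a stripe, so any one missing block is the XOR of all the surviving blocks. The function name xor_blocks() and the block layout are assumptions for the example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XOR the nblocks blocks of one stripe (each len bytes long) into out.
 * The same routine computes the parity block from the data blocks and
 * rebuilds a lost block from the surviving data and parity blocks. */
void xor_blocks(const unsigned char *blocks[], size_t nblocks,
                size_t len, unsigned char *out)
{
    assert(blocks != NULL && out != NULL);
    memset(out, 0, len);
    for (size_t b = 0; b < nblocks; b++)
        for (size_t i = 0; i < len; i++)
            out[i] ^= blocks[b][i];
}
```

To rebuild the block of a failed disk, pass the remaining data blocks together with the parity block; the result equals the lost block.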
2.6 Current SAN Model and Simulation 
Xavier [18, 19] used the CSIM language to simulate a SAN and model its 
activities. They analyzed the key architectural switch characteristics for building Fibre 
Channel storage area networks. They analyzed the performance of wormhole and 
virtual cut-through (VCT) switch architectures, identified their strongest and weakest 
points, and combined the best features of both. In the paper, they 
proposed a switch architecture that doubles network throughput while reducing 
response delay. Using the SAN model, they also analyzed by simulation the 
performance degradation of an FC SAN when link failures occur, quantifying how 
much the global SAN performance is reduced while the system remains in 
the degraded state. However, the simulation model does not model the SAN at the 
FC protocol level. In addition, it does not model the storage module. 
Petra et al. [20] presented SIMLAB, a simulation environment for storage area 
networks. SIMLAB is used to develop and verify distributed algorithms for storage 
area networks and is based on the assumption that the SAN consists of active routers. 
This simulation environment also does not model the SAN at the FC protocol 
level. 
Wilkes [29] used the Pantheon storage-system simulator to model the 
performance of parallel disk arrays and parallel computers. Pantheon included I/O 
processing (generating and routing IOitems), Interconnects (Links, Routes, and 
DMAengines), Processors, Caches, Disks (disk mechanisms and controllers), 
DiskSimple, ArraySimple and other components. Pantheon also used several support 
libraries, such as Raphael and Lintel. However, Pantheon did not have an FC network model.  
DiskSim [22] is another disk storage system simulator to support research in 
storage subsystem algorithm and architecture. It includes modules that simulate disks, 
intermediate controllers, buses, device drivers, request schedulers, disk block caches, 
and disk array data organizations. The disk drive module simulates modern disk drives 
in great detail and has been carefully validated against several production disks. 
DiskSim is intended for simulating disk storage systems rather than storage area 
networks. 
SANmetrix [31] and SANTK [32] are other toolkits that facilitate SAN 
design and development. SAN designers can use them to model, prototype, prove and 
manage SAN designs. However, SANTK cannot be used to develop and validate new 
ideas and algorithms for storage area networks. It also cannot simulate a SAN at the 
FC frame level. 
2.7 SANSim 
Storage technology is moving from enterprise storage to network storage, and it 
will keep moving from network storage to intelligent storage. During this transition, 
new ideas and new algorithms form the core of network storage technology and 
drive its development. The simulation studies and tools mentioned above 
are very limited in modeling and simulation at the FC protocol level. However, it is 
necessary to simulate at the FC frame level in order to monitor and analyze the details of 
FC SAN activities. A SAN simulation tool also needs to synthesize representative I/O 
workloads. SANSim is developed to aid the development and verification of new 
ideas and algorithms for FC storage area networks. It will enable researchers to 
conduct research on network storage technology. Table 1 gives a summary of 
SANSim and other simulators. 
 
Table 1: Summary of SANSim and Other Simulators  
 
Simulator 
Feature  SANSim Pantheon SIMLAB DiskSim SANmetrix SANTK
Support SAN 
simulation √ X √ X √ √ 
Support FC 




√ √ √ √ X X 
Support SAN 
design √ X √ X √ √ 
 
2.8 Summary  
In this chapter, we have presented basic background on SAN, FC and RAID. 
Some other SAN simulation works have been discussed. Most of them cannot support 








Chapter 3   Simulation Methodology 
SANSim uses an event-driven method to develop the simulation model. The 
simulation model is designed through a bottom-up process, following the virtual 
device concept: storage devices and networks in SANSim are presented to the upper 
levels as virtual devices. This modular design increases the portability of the modules. 
Some modules are implemented as global supporting sub-modules so that the higher 
level modules can share them. 
3.1 Event Driven Simulation Environment 
Many modeling and simulation tools are based on the discrete event simulation 
concept. In this approach, communication processes are decomposed into many states 
that can make a transition to different states depending upon the characteristics of the 
triggering events. Among several public domain tools, Network Simulator (NS) [46] 
provides a modeling and simulation environment for protocol development with 
substantial support for simulation of Transmission Control Protocol (TCP), routing, 
and multicast protocols. National Institute of Standards and Technology (NIST) 
Asynchronous Transfer Mode (ATM) simulator [47] is particularly suitable for 
research on ATM networks. Communication System Simulator (CSIM) [48, 
49] is a flexible, general purpose discrete event simulation language for developing 
process-oriented simulation models. NetSim [50] is an event-driven simulator for 
packet networks with a simple X window interface to allow interactive use.  
SANSim also uses an event-driven simulation environment. The simulation 
is carried out by constructing a series of events, each with a timestamp. The 
timestamps determine the order in which events occur during the simulation. When the 
time of an event is equal to the global simulation time, the event is processed. Various 
predefined software functions are called according to the event type. The databases are 
updated and new events are issued to represent the subsequent activity of the 
system. 
Figure 3.1 shows the global event pool of future events. These events are sorted 
in ascending time order and are maintained in a doubly linked list. A global event 
pointer points to the immediate next event. When an event is added to this pool, it is 
inserted at the correct position according to its timestamp. When the simulator completes 
processing the previous event, it fetches the event pointed to by the global event pointer 
and updates the global simulation time. The global event pointer then points to the 
next event. 
Figure 3.1: Events in the Waiting Pool 
With this mechanism, SANSim is able to simulate the behavior of a SAN system by 
processing the events in timestamp order. Statistical data can 
be collected during the execution of each event. 
3.2 High Resolution Timestamp Mechanism 
Effective measurement of storage system activity, such as tracing I/O requests 
and/or other system software events, requires long duration and high resolution 
timestamps. Duration refers to the maximum difference between two timestamps that 
can be effectively measured without external assistance. Resolution refers to the 
smallest granularity difference between any two timestamps.  
Each timestamp consists of two values, which are sampled copies of the 
internal system software clock and a small diagnostic counter. The former provides 
long duration time measurements at a granularity of approximately 10 milliseconds. 
The latter provides short duration (about 55 milliseconds) time measurements at a 
granularity of approximately 850 nanoseconds. The two values are combined to 
achieve both long duration and high resolution. 
3.2.1 System Software Clock 
The system software maintains a rough view of time, relative to system 
initialization, via a single 32-bit unsigned integer referred to as the system software 
clock. This integer is set to zero during system initialization and is incremented once 
for each clock interrupt (i.e., during the clock interrupt service routine). 
 
3.2.2 Diagnostic Counter 
The diagnostic counter can be safely used without interfering with system 
functions. Its behavior and construction are similar to those described above for the 
clock interrupt generator. The relevant logic contains a 16-bit working counter, a 
restart value and control bits. For timing purposes, we modified the system software to 
initialize the control bits and the restart value. Thereafter, the working counter 
repeatedly cycles through its full range, from 65535 to 0, without generating any 
interrupts. When the value reaches zero, the restart value (65535) is immediately 
reloaded. The duration of the counter is therefore 65535 cycles of the clock, or 
approximately 55 milliseconds, and the resolution is the inverse of the clock frequency 
(i.e., 838.574 nanoseconds). 
3.2.3 Timestamp Conversion 
The elapsed time between any two timestamps can be determined by 
manipulating the corresponding values, combining the best quality of each component 
(i.e., duration and resolution). In particular, the diagnostic counter values are used to 
identify the fine grain difference between the timestamps. The system software clock 
values are used to determine how many times the diagnostic counter wrapped between 
the two timestamps. The following pseudo-code demonstrates the process: 
int finediff = timestamp1.hires - timestamp2.hires 
int loresdiff = timestamp2.lores - timestamp1.lores 
int turnovers = round(((loresdiff x 11932) - finediff) / 65535) 
int hiresticks = finediff + (turnovers x 65535) 
float millisecs = hiresticks x 0.000838574 
This combination of long duration, low resolution values and short duration, 
high resolution values succeeds without error so long as the duration of the latter is more 
than double the maximum error in the former. The resolution of the system software 
clock places a lower bound on the corresponding error (i.e., 10 milliseconds).  
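The conversion can also be written as a small C function. This is a sketch under stated assumptions rather than the SANSim implementation: the type timestamp_t, the helper div_round() and the function elapsed_ms() are hypothetical names, the rounding is done with integer arithmetic, and the tick length in milliseconds (0.000838574) is derived from the 838.574 nanosecond resolution given in Section 3.2.2.

```c
#include <assert.h>

#define TICKS_PER_LORES 11932L      /* counter ticks per clock interrupt (about 10 ms) */
#define HIRES_PERIOD    65535L      /* counter ticks per wrap, as given in the text */
#define MS_PER_TICK     0.000838574 /* 838.574 ns expressed in milliseconds */

typedef struct {
    unsigned long  lores;  /* copy of the system software clock */
    unsigned short hires;  /* copy of the diagnostic counter (counts down) */
} timestamp_t;

/* Integer division rounded to the nearest integer. */
static long div_round(long num, long den)
{
    return (num >= 0) ? (num + den / 2) / den
                      : -((-num + den / 2) / den);
}

/* Elapsed time in milliseconds from t1 (earlier) to t2 (later). */
double elapsed_ms(timestamp_t t1, timestamp_t t2)
{
    /* The counter counts down, so earlier minus later is positive
     * when no wrap occurred between the two samples. */
    long finediff   = (long)t1.hires - (long)t2.hires;
    long loresdiff  = (long)(t2.lores - t1.lores);
    long turnovers  = div_round(loresdiff * TICKS_PER_LORES - finediff,
                                HIRES_PERIOD);
    long hiresticks = finediff + turnovers * HIRES_PERIOD;

    assert(hiresticks >= 0);
    return (double)hiresticks * MS_PER_TICK;
}
```

For two samples taken within the same clock tick, the result is simply the counter difference scaled by the tick length; for samples further apart, the low resolution clock is used only to decide how many counter wraps to add back.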
3.3 Open System/Closed System Models 
SANSim supports both open and closed system models. 
3.3.1 Open Subsystem Model 
Open subsystem models use predetermined arrival times for requests, which 
are independent of the storage subsystem's performance. In an open subsystem 
model, there is no feedback between individual request response times and subsequent 
request arrival times. If the storage subsystem cannot handle the incoming workload, 
the number of outstanding requests will grow without bound and the storage system 
becomes overloaded. 
Because there is no performance/workload feedback, open subsystem models 
ignore real systems' tendency to regulate the storage workload based on storage 
performance. That is, when the storage subsystem is overloaded, the host system will 
spend more time waiting for it, instead of generating additional work for it. One effect 
of this problem is that the workload generator for an open subsystem model may allow 
requests to be outstanding concurrently that would never in reality be outstanding at 
the same time (e.g., the read and write requests that comprise a read-modify-write 
action on some disk block). 
3.3.2 Closed Subsystem Model 
In a closed subsystem model, a request arrival time depends entirely upon the 
completion time of a previous request. A closed subsystem model maintains a constant 
population of requests. Whenever the completion of a request is reported, a new 
request is generated and issued into the storage subsystem. That is, closed subsystem 
models assume unqualified feedback between storage subsystem performance and the 
incoming workload. A closed queuing model, which maintains a constant number of 
outstanding requests, is one example of a closed subsystem model. 
The main problem with closed subsystem models is that they cannot represent 
bursty arrival streams. Measurements of real storage subsystem workloads have 
consistently shown that arrival patterns are bursty, consisting of occasional periods of 
intense activity interspersed with long periods of idle time (i.e., no incoming requests). 
With a constant number of requests in the system, such bursts cannot occur. 
Workload scaling in a closed subsystem model can be accomplished by simply 
increasing or decreasing the constant request population. If non-zero think times are 
used, workload scaling can also be accomplished by changing the think times, but this 
approach would be less exact because of the feedback effects, which are independent 
of the think times. 
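The difference between the two models can be made concrete with a deliberately simplified C sketch. It assumes a single server with a constant service time, fixed inter-arrival gaps in the open model, and zero think time in the closed model; none of these simplifications come from SANSim, and the function names are hypothetical.

```c
#include <assert.h>

/* Open model: arrivals are predetermined (one every `interarrival`
 * time units, starting at time 0) and do not depend on completions.
 * Returns the number of requests still outstanding at `horizon`. */
int open_outstanding(double interarrival, double service, double horizon)
{
    int arrived = 0, completed = 0;
    double server_free = 0.0;   /* time at which the server becomes idle */

    assert(interarrival > 0.0 && service > 0.0);
    for (double t = 0.0; t <= horizon; t += interarrival) {
        double start = (t > server_free) ? t : server_free;
        server_free = start + service;   /* FIFO, one request at a time */
        arrived++;
        if (server_free <= horizon)
            completed++;
    }
    return arrived - completed;
}

/* Closed model: a fixed population with zero think time. Every
 * completion immediately issues a new request, so the number of
 * outstanding requests never changes. Returns the number of requests
 * completed by `horizon`. */
int closed_completed(int population, double service, double horizon,
                     int *outstanding)
{
    assert(population > 0 && service > 0.0 && outstanding != 0);
    *outstanding = population;
    return (int)(horizon / service);     /* the single server is always busy */
}
```

With an inter-arrival gap shorter than the service time, open_outstanding() grows with the horizon, which is the unbounded backlog described above; closed_completed() always reports the configured population as outstanding.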
3.4 Queue Models in Storage System 
Because request queues and the corresponding schedulers can be present in 
several different storage subsystem components (e.g., device drivers, intelligent 
controllers and disk drives), request queue/scheduler functionality is implemented in 
a separate queue module. Queue-containing components pass new requests to the 
queue module. When such a component is ready to initiate an access, it calls 
the queue module, which selects (i.e., schedules) one of the pending accesses 
according to the configured policies. When an access completes, the component 
informs the queue module. In response, the queue module returns a list of requests that 
are now complete. This list may contain multiple requests because some scheduling 
policies combine sequential requests into a single larger storage access. The queue 
module collects a variety of useful statistics in SANSim (e.g., response times, service 
times, inter-arrival times, idle times, request sizes and queue lengths), obviating the 
need to replicate such collection at each component. 
3.5 Summary 
SANSim uses an event-driven method to develop the simulation model. Using a 
high resolution timestamp mechanism, SANSim can simulate detailed FC SAN 
activities. SANSim supports both open and closed system models. Different queue 











Chapter 4   Development of SANSim 
This chapter gives an overview of SANSim development and its main modules: 
I/O workload module, Host module, Network module and Storage module.   
4.1 SANSim Overview  
 




Based on the functional view of the storage area network, SANSim [1] divides 
the entire SAN into four modules: I/O workload module, Host module, Network 
module and Storage module (see Figure 4.1). 
The I/O workload module generates I/O request streams according to the workload 
distribution characteristics and sends them to the host modules. The Host module 
encapsulates the I/O workload into SCSI commands and sends them to the Host Bus 
Adaptor (HBA) sub-modules. The Network module simulates the network connectivity, 
topology and communication mechanism. The FC network module includes three 
sub-modules: FC_controller module, FC_switch module and FC_communication. The 
Storage module maps I/O data to the storage devices. 
Before looking into the detailed simulation modules, we first discuss some 
programming concepts in SANSim: memory allocation and data structures, networks 
and buses. 
4.1.1 Memory Allocation in SANSim 
In order to handle memory allocation during the simulation in a controlled way, 
SANSim reserves a sufficiently large memory area for the simulation when it 
starts running. The initialization routine constructs a global 
data structure, SANSim (see Figure 4.2). The SANSim structure holds all information about 
the subordinate data structures, such as device drivers, networks/buses, and hard disk drives. 
It also holds other information that is used during the simulation, such as the 
simulation time, the global event pool, etc. The start address of the reserved memory 
area is stored in a field of the SANSim structure, sansim->start_address. Another integer 




Figure 4.2: The Data Structure of SANSim and the Reserved Memory Area 
The function SANSIM_malloc(int size) handles memory 
allocation from the reserved memory area, taking the length of the required memory as its 
parameter. It returns the address sansim->start_address + sansim->current_offset 
and then increases sansim->current_offset by the value of “size”. 
If the updated value exceeds the total length of the reserved memory area, a fatal error 
is reported and the simulation terminates. 
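A minimal C sketch of such an allocator is given below. Only start_address and current_offset are named in the text; the field total_length, the structure name sansim_t and the helper sansim_reserve() are assumptions for this example, and alignment of the returned addresses is not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Only the fields needed here; the real structure holds much more. */
typedef struct {
    char  *start_address;   /* start of the reserved memory area */
    size_t current_offset;  /* offset of the next free byte */
    size_t total_length;    /* assumed name: size of the reserved area */
} sansim_t;

static sansim_t *sansim;

/* Hypothetical helper: reserve the whole area once, at start-up. */
void sansim_reserve(size_t total)
{
    sansim = malloc(sizeof *sansim);
    if (sansim == NULL)
        abort();
    sansim->start_address = malloc(total);
    if (sansim->start_address == NULL)
        abort();
    sansim->current_offset = 0;
    sansim->total_length = total;
}

/* Hand out the next `size` bytes of the reserved area; terminate the
 * simulation if the area is exhausted. */
void *SANSIM_malloc(int size)
{
    void *p;

    assert(sansim != NULL && size >= 0);
    if (sansim->current_offset + (size_t)size > sansim->total_length) {
        fprintf(stderr, "SANSIM_malloc: reserved memory area exhausted\n");
        exit(EXIT_FAILURE);
    }
    p = sansim->start_address + sansim->current_offset;
    sansim->current_offset += (size_t)size;
    return p;
}
```

Because memory is never returned to the reserved area, allocation is a single pointer addition, and the total memory use of a simulation run is bounded by the size chosen at start-up.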
During the initialization, most of the data structures are constructed within the 
reserved memory area using the SANSIM_malloc() function. The start addresses of these 
data structures are stored in the corresponding members of the SANSim global data 
structure. Most of these data structures represent system components, 
such as device drivers, networks, buses and hard disk drives. Some of the data 
structures may be constructed as an array for easy location in case of multiple entities 






















4.1.2 Events in SANSim 
During the simulation, there are many events to be scheduled. Each of the 
system entities may issue one or more events. In order to control the use of system 
memory, we handle these events in a controlled way (see Figure 4.3). We maintain a 
general free event pool and a pointer to it, sansim->extraq. 
During the initialization, a number of events are allocated from the reserved memory 
area, linked together and attached to sansim->extraq. The function 
getfromextraq() takes an event from the free pool and returns its address. When the 
number of free events runs low, getfromextraq() automatically allocates 
an additional number of free events from the reserved memory and adds them to 
the free event pool. After an event has been executed, it may be reused for scheduling the 
next event by changing the event type and time and updating other information. If the event 
is not reused, it is returned to the free event pool pointed to by sansim->extraq through 
the function addtoextraq(). 
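The free event pool can be sketched in C as a singly linked free list. This is an illustration only: the static variable extraq stands in for sansim->extraq, the refill size EXTRAQ_BATCH and the value of SANSIM_EVENT_SPACESIZE are assumed, the pool is refilled only when it is completely empty, and calloc() stands in for allocation from the reserved memory area.

```c
#include <assert.h>
#include <stdlib.h>

#define SANSIM_EVENT_SPACESIZE 64   /* assumed value */
#define EXTRAQ_BATCH 4              /* assumed refill size */

typedef struct ev {
    double time;
    int type;
    struct ev *next;
    struct ev *prev;
    int temp;
    char space[SANSIM_EVENT_SPACESIZE];
} event;

static event *extraq;       /* stands in for sansim->extraq */
static int extraq_count;    /* number of events currently in the free pool */

/* Return an unused event to the head of the free pool. */
void addtoextraq(event *ev)
{
    assert(ev != NULL);
    ev->next = extraq;
    ev->prev = NULL;
    extraq = ev;
    extraq_count++;
}

/* Take one event from the free pool, refilling the pool first if it
 * is empty. */
event *getfromextraq(void)
{
    event *ev;

    if (extraq == NULL) {
        event *batch = calloc(EXTRAQ_BATCH, sizeof *batch);
        if (batch == NULL)
            abort();
        for (int i = 0; i < EXTRAQ_BATCH; i++)
            addtoextraq(&batch[i]);
    }
    ev = extraq;
    extraq = ev->next;
    extraq_count--;
    ev->next = NULL;
    return ev;
}
```

Recycling events through the pool avoids repeated allocation during the simulation: an event returned with addtoextraq() is the next one handed out by getfromextraq().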
 

























sansim->intq is the global event pointer pointing to the internal event pool. 
Two separate functions, addtointq() and getfromintq(), handle the 
insertion and fetching of events. The function addtointq() takes a pointer 
to the event to be added. The function getfromintq() returns a pointer to the earliest 
event. The data structure of the general event is defined as follows. 
typedef struct ev { 
   double time; 
   int type; 
   struct ev *next; 
   struct ev *prev; 
   int    temp; 
   char space[SANSIM_EVENT_SPACESIZE]; 
} event;  
First, the caller obtains an event structure and assigns the event time, 
event type and other information fields, and then calls addtointq() with the address of 
the event to add it to the internal event pool. addtointq() uses the event time to 
determine the position of the new event. The next and prev pointers are used to 
maintain a doubly linked list. getfromintq() is responsible for updating sansim->intq and 
returning the earliest event for processing. If there is no event in the pool, it returns 
NULL, and the caller is responsible for handling this case. 
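The two functions can be sketched in C as operations on a time-ordered doubly linked list. The sketch assumes a linear search from the head of the list and uses a static variable intq in place of sansim->intq; the value of SANSIM_EVENT_SPACESIZE is also assumed. The actual SANSim code may differ.

```c
#include <assert.h>
#include <stddef.h>

#define SANSIM_EVENT_SPACESIZE 64   /* assumed value */

typedef struct ev {
    double time;
    int type;
    struct ev *next;
    struct ev *prev;
    int temp;
    char space[SANSIM_EVENT_SPACESIZE];
} event;

static event *intq;   /* stands in for sansim->intq: earliest pending event */

/* Insert ev so that the list stays sorted by ascending time. Events
 * with equal times keep their insertion order. */
void addtointq(event *ev)
{
    event *cur = intq, *last = NULL;

    assert(ev != NULL);
    while (cur != NULL && cur->time <= ev->time) {
        last = cur;
        cur = cur->next;
    }
    ev->prev = last;
    ev->next = cur;
    if (cur != NULL)
        cur->prev = ev;
    if (last != NULL)
        last->next = ev;
    else
        intq = ev;          /* ev is the new earliest event */
}

/* Unlink and return the earliest event, or NULL if the pool is empty. */
event *getfromintq(void)
{
    event *ev = intq;

    if (ev == NULL)
        return NULL;
    intq = ev->next;
    if (intq != NULL)
        intq->prev = NULL;
    ev->next = NULL;
    ev->prev = NULL;
    return ev;
}
```

Events inserted in any order come back from getfromintq() in ascending time order, which is what the main simulation loop relies on.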
The events are processed continuously until the stop condition is reached. 
While an event is being processed, one or more new future events may be 
issued and inserted into the global event pool. The following pseudo code illustrates the 
event processing loop: 
do { 
     nextEvent = getfromintq(); 
     simtime = nextEvent->time; 
     processX(nextEvent) { 
          … 
          newevent construction; 
          addtointq(newevent); 
     } 
} while (stop condition not reached) 
The process executes the predefined functions for the event according to the 
information passed in the event structure. Typically, after the execution, one or more 
future events are generated and added to the global event pool. 
For convenient lookup, we use an information structure to hold 
the start addresses of similar components. The address of the information structure is 
stored in the SANSim global data structure. Given the ID and other information from the 
arriving event, we are able to retrieve the component's data structure from the 
information structure. For example, we define the following information structure for 
simple_disk (see Figure 4.4). 
typedef struct simpledisk_info { 
   struct simpledisk  *simpledisks; 
   int numsimpledisks; 
} simplediskinfo_t; 
The SANSim structure has a pointer, simpledisk_info, pointing to the 
simplediskinfo_t data structure, which is allocated during the initialization. During the 
initialization, the number of simple disks is determined from the given configuration file, 
and the initialization process allocates a contiguous space for that number of 
simpledisk structures from the reserved memory area and stores the start address in the 
member “simpledisks”. From the arriving event, we know that the component type is a 
“simple disk”, and from the “device No” we can quickly retrieve the data structure 
from the simpledisk_info. With the component’s data structure and the event, we are 
able to determine which function to use. 
 
Figure 4.4: SANSim Simple Disk Memory Allocation 
4.1.3 Modeling of System Components 
The system components are modeled as a set of functions that respond to 
particular events and move the component from one state to another. The collection of 
the status and related parameters of a component forms a data structure. As an example, a 
device driver data structure is a collection of parameters, such as the overhead of the 
driver to execute an I/O command, and of driver status, such as the number of I/Os in the 
system, whether it holds the hardware link, etc. The data structure by itself is not a 
complete simulation model. The behavior of the component is modeled by a series of 
functions or routines. These functions or routines execute upon certain events at a 
particular time. The information embedded in a particular event identifies which data 
structure is to be accessed. The event type and the status information stored in the data 
structure determine the particular function or routine to be invoked. Figure 4.5 illustrates 
the concept of modeling a system component. From the unique arrival events, we are able 















































4.1.4 Integration of Network Module to the “Bus” Structure 
Message passing among the various components is controlled by a pair of 
structures, buspath and slotpath. The buspath specifies each “bus” along which the 
message travels from source to destination, while the slotpath specifies the 
entry and exit slots of the corresponding “bus” (see Figure 4.6). The buspath and slotpath 
are implemented as 128-bit wide values as follows. 
typedef struct _u_int128_t { 
 u_int32_t v4; 
 u_int32_t v3; 
 u_int32_t v2; 
 u_int32_t v1; 
} u_int128_t;  /* for slotpath or buspath */ 
This gives 16 bytes to support 8 stages of interconnection. Each stage has 
2 bytes, addressing a maximum of 256 buses. In the slotpath variable, the entry slot and 
the exit slot of a stage share these 2 bytes, so the slotpath holds 16 slot entries of 1 byte each. 
 
Figure 4.7: Buspath and Slotpath 
 
For example, Figure 4.7 shows a buspath and slotpath pair for the message 
passing between the device connected to “bus A slot A1” and the device connected to 
“bus D slot D2”.  
After initialization, the simulator constructs a set of buspath and slotpath pairs 
that hold the identification and routing information from I/O drivers (sources) to storage 
devices (destinations). This pair of route information is specific to the simulator 
and does not necessarily correspond to a real interconnection structure. Most 
commonly, a real SAN uses a complex network (an FC network or an IP network 
dedicated to storage) for the interconnection. Nevertheless, we use this “bus” and “slot” 
pair to identify and route messages among participating devices, whether the bus is a 
real one or a virtual one. 
For a network connection, what concerns us most are the entry and exit 
points that connect to the controllers or storage devices. First, we use a bus 
with several slots to represent the network for the purpose of the routing 
information mentioned above. Second, the bus is marked as a special bus type. When a 
message is passed to this bus, it triggers the network module to perform the network 
simulation. After the message is delivered, control returns to the “bus” at the 
exit slot. 
  
Figure 4.8: FC Network in SANSim 
Consider the situation shown in the figure. The devices dev2 and dev3 are 
connected to slot #2 and slot #3 of bus #3. An FC target HBA 


































FC initiator HBA and the FC target HBA. Bus #1, slot #1 is the input to the 
FC initiator HBA. To access dev #2 from the device driver, a request needs a buspath of  
01 02 03 -1 -1 -1 -1 -1 
 
and slotpath of  
00 01 xx xx 01 02 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 
 
Based on the routing information, a message such as an I/O request travels to the FC 
initiator. The source ID is retrieved from the initiator’s data structure, and the 
target device is obtained from the next level of bus, bus #03. From bus #03 and 
slot #01, we get the target FC controller data structure and the target ID. With the 
source ID and target ID ready, the communication in the network can be simulated. 
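The route lookup described above can be sketched in a few lines of Python. This is only an illustration, not SANSim code: the function name decode_route is hypothetical, and the pairing of slotpath entries as (entry slot, exit slot) per bus level is an assumption inferred from the example values, with None standing for the “xx” entries of the network bus.

```python
from typing import List, Optional, Tuple

END = -1  # unused path entries are filled with -1 in the example above

def decode_route(buspath: List[int],
                 slotpath: List[Optional[int]]) -> List[Tuple[int, Optional[int], Optional[int]]]:
    """Return one (bus, entry_slot, exit_slot) hop per bus level.

    Assumption: slotpath holds two entries per bus level, entry slot first.
    """
    hops = []
    for level, bus in enumerate(buspath):
        if bus == END:  # end of the used part of the path
            break
        hops.append((bus, slotpath[2 * level], slotpath[2 * level + 1]))
    return hops

# Values taken from the example: buspath 01 02 03, slotpath 00 01 xx xx 01 02
buspath = [1, 2, 3, -1, -1, -1, -1, -1]
slotpath = [0, 1, None, None, 1, 2] + [-1] * 10
route = decode_route(buspath, slotpath)
```

Under this reading, the last hop leaves bus 3 at slot 2, which is where dev2 is attached.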
4.2 I/O Workload Module 
SANSim supports both system-traced I/O workload and synthetic I/O workload. 
This section focuses on the synthetic workload generation [33, 34, 35, 36, 37]. 
4.2.1 Request Parameters and Workload Patterns 
Generally speaking, a disk request is defined by five values: device, operation, 
location, size and arrival time. The workload module is able to generate several 
different arrival patterns such as Poisson arrivals, equal time intervals and so on. It can 
also generate arrival times to describe the situation that at different time periods the 
requests come in at different rates. Another capability of the workload module is to 
generate repeatable requests with the repeatability specification. This is to describe the 
situation of  some files being more popular than others in the storage system.  
A unique request includes request ID, device number, starting LBA address, 
size and read/write action for each request. The workload generation module first reads 
the parameter file and then generates all unique requests according to the specifications 
in the parameter file.  
After generating all unique requests necessary, the workload generation module 
generates final requests needed for the simulation. The requests are generated one by 
one until the last request. 
4.2.2 Parameter File 
Users have to specify the following information in the parameter file to 
generate workload: 
• Initial seed number  
• Total number of requests,  
• Total unique request number,  
• Device number,  
• Request rates,  
• Read request percentage,  
• Size distribution,  
• Repeatability of requests.  
The initial seed number is a negative integer that specifies the starting point of 
the random number sequence. If it is not specified, the program uses the computer 
time as the seed for generating random numbers. 
The device number information is given by listing all the device numbers in the 
parameter file. The Workload Generation Module randomly chooses a device number 
for each unique request. 
The request rate is an integer in units of requests per second. If the request 
rate changes over different time periods of the simulation, each time period and its 
corresponding request rate have to be specified in the parameter file. 
The operation information is given as the percentage of read requests. An 
operation can only be either a read or a write. 
The size is given in number of blocks. If there is more than one request size in 
the simulation, each size and its corresponding number of requests have to be 
specified.  
Different unique requests may be repeated a different number of times in the 
final workload. Therefore the repeatability of requests is given by the unique 
request ID and the corresponding number of times it appears in the workload. 
4.2.3 Implementation of the Workload Generation Module 
There are two main steps to generate block level requests as shown in Figure 
4.9. The first step is to generate unique requests. After all the unique requests are 




Figure 4.9: Two Main Steps to Generate Repeatable Requests 
 
(a) Unique Request Generation 
Generating a unique request means generating a request ID and the other 
associated parameters, namely the device number, starting LBA address, size and 
read/write action. The workload generation module first reads the parameter file, and 
then generates all unique requests accordingly.  
Request ID is an integer generated sequentially for each request, starting 
from 1. The largest ID equals the number of unique requests specified in the 
parameter file.  
Device number identifies the device on which the document is stored in the 
storage system. If there is more than one device, the device number is chosen 
randomly for each document. Thus each device ends up holding an almost equal number 
of documents if the number of documents is large enough. 
Starting LBA address, in blocks, is assigned sequentially to each document, 
starting from block 1 of each device. 
Size in blocks is generated for each request according to the specifications in 
the parameter file.  
Read/write action: the action of each unique request is either read or write. 
It is generated according to the percentage of read requests specified in the 
workload generation parameter file. 
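As an illustration of the unique request generation step, the following Python sketch builds the list of unique requests. It is not the SANSim implementation: the names UniqueRequest and generate_unique_requests are hypothetical, the size is simply drawn from a list of allowed sizes, and advancing the starting LBA by the size of each document is an assumed reading of “given sequentially”.

```python
import random
from dataclasses import dataclass

@dataclass
class UniqueRequest:
    req_id: int    # sequential, starting from 1
    device: int    # randomly chosen device number
    lba: int       # starting LBA, sequential per device from block 1
    size: int      # size in blocks
    is_read: bool  # True for read, False for write

def generate_unique_requests(n_unique, devices, sizes, read_percent, seed):
    rng = random.Random(seed)             # seeded for repeatable workloads
    next_lba = {d: 1 for d in devices}    # next free block on each device
    requests = []
    for req_id in range(1, n_unique + 1):
        device = rng.choice(devices)
        size = rng.choice(sizes)          # assumption: uniform over listed sizes
        lba = next_lba[device]
        next_lba[device] += size          # assumption: documents are laid out back to back
        is_read = rng.random() * 100 < read_percent
        requests.append(UniqueRequest(req_id, device, lba, size, is_read))
    return requests

reqs = generate_unique_requests(50, devices=[0, 1], sizes=[8, 16],
                                read_percent=70, seed=-7)
```

Using a seeded generator mirrors the role of the initial seed number: the same parameter values reproduce the same set of unique requests.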
 
(b) Generate Final Block Level Workload 
After generating all unique requests, the workload generation module generates 
final workload for the simulation. The requests are generated one by one until the last 










Figure 4.10: Generate Block Level Requests for Simulation 
 
Step 1: Generate the arrival time for a request. There are three types of 
arrival pattern: equal interarrival times, normally distributed interarrival times 
and Poisson arrivals. The arrival times can also be generated to describe the 
situation in which requests come in at different rates during different time periods.  
Step 2: Select the unique request that this request refers to. Unique requests 
may have different repeatability, meaning that they may be referred to a different 
number of times during the whole simulation. Step 2 therefore chooses a unique 
request according to the repeatability specification.  
Step 3: Form the request. The detailed request information, consisting of the 
arrival time, device number, address, size and action, is written to an output trace 
file.  
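The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SANSim code: generate_workload is a hypothetical name, only the equal-interval and Poisson patterns are shown, and the repeatability specification is assumed to be a mapping from unique request ID to the exact number of appearances.

```python
import random

def generate_workload(repeat_counts, rate, pattern="poisson", seed=1):
    """Return a list of (arrival_time_s, unique_request_id) tuples."""
    rng = random.Random(seed)
    # Step 2 preparation: each ID appears exactly as often as specified.
    pool = [uid for uid, n in repeat_counts.items() for _ in range(n)]
    rng.shuffle(pool)
    trace, t = [], 0.0
    for uid in pool:
        # Step 1: advance the clock by one interarrival time.
        if pattern == "poisson":
            t += rng.expovariate(rate)   # exponential gaps give Poisson arrivals
        elif pattern == "equal":
            t += 1.0 / rate
        else:
            raise ValueError("unsupported arrival pattern: " + pattern)
        # Step 3: form the request record (only time and ID in this sketch).
        trace.append((t, uid))
    return trace

equal_trace = generate_workload({1: 3, 2: 1}, rate=100, pattern="equal", seed=-1)
poisson_trace = generate_workload({1: 3, 2: 1}, rate=100, pattern="poisson", seed=-1)
```

A time-varying request rate could be handled in the same loop by looking up the rate that applies to the current value of t.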
4.3 SANSim Host Module 
The Host Module (see Figure 4.11) mainly consists of the I/O driver, the system 
bus, the HBA and the upper layer. The I/O driver interacts with the upper layer and 
schedules and manages I/O requests that come from either synthetic generation or the 
replay of a trace file. The system bus transmits messages from one device to another 
with a certain latency. The HBA transmits and receives messages between the host 
system and the target system. 
  
 
Figure 4.11: SANSim Host Module 
4.3.1 The Upper Layer 
The upper layer, which includes the CPUs, main memory, interrupt controller, 
process control and system clock, is modeled abstractly. The details of these modules 
are described separately, since we focus on network modeling and storage system 
performance simulation. Data structures (see Figure 4.12) have been developed to 
handle the activities mentioned above. Every CPU has CPU events, and each CPU event 
has processes and interrupts that link to their own events. 
[Host block diagram labels: Applications; Operating System; Device Drivers; CPU ... CPU; Main Memory; System Bus; Bus Adapter/Interrupt Controller; I/O Bus; Network Controller; I/O Controller] 
 
Figure 4.12: Data Structure in SANSim Host Module 
4.3.2 I/O Driver Module  
The I/O device driver interfaces with the upper layer, which is either the host 
system portion of the simulator or the workload generator. I/O requests arrive via a 
single interface function and are placed in a device queue. The I/O requests in the 
queue are then scheduled to access the underlying storage device (see Figure 4.13). 































Figure 4.13: I/O Drivers Data Structure 
 
The following functions are designed to handle the events issued during the 
simulation: 
IO_REQUEST_ARRIVE:       iodriver_request() 
IO_ACCESS_ARRIVE:        iodriver_schedule() 
IO_INTERRUPT_ARRIVE:     iodriver_interrupt_arrive() 
IO_RESPOND_TO_DEVICE:    iodriver_respond_to_device() 
IO_ACCESS_COMPLETE:      iodriver_access_complete() 
IO_ACCESS_REJECT:        iodriver_access_reject() 
IO_ACCESS_RETRY:         iodriver_access_retry() 
IO_INTERRUPT_COMPLETE:   iodriver_interrupt_complete() 
 
The I/O requests “arrive” through the event IO_REQUEST_ARRIVE at a certain 
simulation time. The function iodriver_request() handles this event. First, the I/O 
request embedded in the event structure is analyzed and updated. Depending on the 
given configuration, iodriver_request() may call iodriver_schedule() directly. 
Typically, the iodriver and the underlying storage subsystem are modeled as a closed 
system: the iodriver is only allowed to have a maximum number of requests 
outstanding at the corresponding device. If the number of outstanding requests has 
not yet reached the maximum, a new access can be scheduled, and iodriver_schedule() 
is called to handle this new access to the device. 
Under another configuration, a new event called IO_ACCESS_ARRIVE is generated 
instead. The event is passed to the iodriver_schedule() function, which schedules 
the access to the attached device. 
The rest of the functions listed above handle various other events (e.g. 
interrupt events) to simulate the behavior of the I/O handling. 
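The event-to-handler dispatch and the closed-system limit can be illustrated with the short Python sketch below. The class IoDriver and its method names are hypothetical stand-ins for the iodriver functions named above, only two of the events are modeled, and the direct call from the request handler to the scheduler corresponds to the first configuration described in the text.

```python
from collections import deque

class IoDriver:
    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # closed-system limit
        self.outstanding = 0                    # requests currently at the device
        self.queue = deque()                    # device queue of waiting requests
        self.issued = []                        # requests handed to the device
        # Event name -> handler, mirroring the list of event functions.
        self.handlers = {
            "IO_REQUEST_ARRIVE": self.request,
            "IO_ACCESS_COMPLETE": self.access_complete,
        }

    def handle(self, event, payload):
        self.handlers[event](payload)

    def request(self, req):
        self.queue.append(req)
        self.schedule()             # direct call instead of a new event

    def schedule(self):
        # Issue queued requests while the outstanding limit is not reached.
        while self.queue and self.outstanding < self.max_outstanding:
            self.issued.append(self.queue.popleft())
            self.outstanding += 1

    def access_complete(self, req):
        self.outstanding -= 1
        self.schedule()             # a freed slot may admit a waiting request

driver = IoDriver(max_outstanding=2)
for r in ("r1", "r2", "r3"):
    driver.handle("IO_REQUEST_ARRIVE", r)
issued_before = list(driver.issued)
driver.handle("IO_ACCESS_COMPLETE", "r1")
```

In this run the third request waits in the device queue until the first completion frees an outstanding slot.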
4.3.3 Bus Module 
The bus module provides a shared communication module for the interconnections. 
It supports both the shared bus type and the exclusive bus type. Similarly, the 
SANSim global structure holds a bus_info pointer that points to the bus information 
structure. From the information structure, given a bus number, we can retrieve the 
bus structure. The bus structure holds information about the bus, such as the bus 
type and bus bandwidth, as well as the slots pointer, which points to the slots array 
that holds the information about the device connected to each slot. Basically, the 
slot structure records the device number and device type of the connected device. 
Figure 4.14 shows the relationship between the structures. 
 
Figure 4.14: Bus Structure 
 
4.3.4 Controller and Adapter (HBA) 
Currently, a simple pass-through controller can be modeled, which is simulated 
with a fixed overhead. With the support of the queue sub-module and the 
disk-block-cache module, a more complex type of controller can perform queuing of 
arriving request commands and caching. A sequential analyzer module is developed as 
an intelligent part that separates sequential I/O streams from a mixed workload for a 
specified HBA, such as a Fibre Channel HBA. The controller data structure holds a 
reference to the controller structure array. 
4.4 SANSim Network Module 
The key function of the FC network module is to simulate the FC connectivity, 
topology and communication protocol. The FC network module includes three 
sub-modules: the FC_controller module, the FC_switch module and the FC_port & 
FC_communication module. The FC_controller module simulates the communication 
behavior of FC commands or data frames. The FC_switch module models all the FC 
ports and the switch architecture, as well as the routing and flow control. The 
FC_communication module transfers FC frames between the FC ports. 
4.4.1 Network Simulation 
Given a connection topology and a communication handshaking protocol, we 
implement an event-driven simulation module to simulate the timing behavior of a 
message transmitted over the network. The source, the target and the message length 
are known at run time, and the simulator calculates the network latency for the 
message. The connection topology may be complex, involving multi-stage routing. 
Currently, we have implemented the FC protocol for SCSI (FCP) over FC Arbitrated 
Loop or Fabric connections.  
The three sub-modules are the FC_controller module, the FC_switch module and the 
FC communication (FC_COM) module (see Figure 4.15). The FC_controller module 
generates FC frames for exchanging SCSI commands and data. The FC_switch module 
models all the FC ports and the switch architecture, as well as the routing and flow 
control. The FC_COM module transfers FC frames between the FC ports. 
 (a) Abstracted view  
(b) Simulation modules 
Figure 4.15: Modeling of Fibre Channel Network in SANSim. 
The interactions and supporting relationships of these three modules are 
depicted in Figure 4.16. The FC_controller module supports both initiator and target 
operation. It consists of a bus interface that connects to the host, an FCP engine 
that performs the FC framing for SCSI, and an FC port that communicates with other FC 
devices. The key function of the FC_switch module is to direct and forward FC frames 
to the destination port. It includes a routing function, a cross-bar for forwarding 
FC frames, and several FC ports. 
The FC communication module is to schedule and manage incoming and outgoing 






















connection are supported in the communication module, the one-to-one connection and 
the arbitrated loop connection. The flow control of the connection is implemented 
as buffer-to-buffer credit for the one-to-one connection and alternate BB credit for 
the loop connection. The link management handles ordered sets and primitive signals 
such as Idle and R_RDY. 
 
Figure 4.16: FC Network Modules and Relationship 
 
4.4.2 FC_controller Module 
(a) FCP Operation 
The FC_controller module models the behavior of the FC HBA both as initiator 
and as target. As shown in Figure 4.16, the FC controller module includes three 
sub-modules: Bus_interface, FCP_engine and FC_port. The Bus_interface sub-module 
handles the communication between the device driver and the controller, such as DMA 
and interrupts. The FCP_engine is responsible for constructing the different FC 
frames corresponding to each sequence of an FCP exchange. The FC_port is responsible 
for delivering an FC frame to the destination port on behalf of the communication 
module. 
When a SCSI request arrives at the initiator, the device driver sends SCSI 
commands to the FC_controller through the Bus_interface. The FC_controller then 
executes the command and fetches the SCSI I/O’s information from memory. An 
FCP_CMND frame for each SCSI I/O is constructed in the FCP_engine and then sent out 
through the FC_port module. 
After the target FC_controller receives the FCP_CMND, the FCP_CMND is sent to 
the target’s host memory through the Bus_interface. Then the target device 
driver/firmware decapsulates the FCP_CMND and processes the SCSI command. 
For an FCP Read operation, the target host decapsulates the SCSI request and 
passes it to the corresponding memory/SCSI device. Once the data is ready, the target 
FC_controller is responsible for transferring the data back to the requestor (the 
initiator) and for sending the FCP_RSP completion message once all data has been 
sent. For an FCP Write operation (SCSI Write 10), the target driver allocates 
sufficient memory for the incoming data, up to the maximum allowed FCP XFER Burst 
Length, and sends FCP_XFER_RDY to the initiator to request the data. When the 
initiator receives the FCP_XFER_RDY, it starts the FCP_DATA sequences. Finally, when 
the target has received all data successfully, it sends an FCP_RSP indicating the 
completion of the FCP operation.  
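The frame sequences of the FCP Read and Write operations described above can be summarized by the following Python sketch. It is an illustration only: fcp_exchange is a hypothetical helper, the 2048-byte data payload per frame and the burst length parameter are assumptions, and the sketch lists frames without modeling any timing.

```python
def fcp_exchange(op, length, max_payload=2048, burst_len=None):
    """Return the ordered list of (sender, frame_type) for one FCP exchange."""
    frames = [("initiator", "FCP_CMND")]
    if op == "read":
        n_data = -(-length // max_payload)       # ceiling division
        frames += [("target", "FCP_DATA")] * n_data
    elif op == "write":
        burst = burst_len or length              # assumption: one burst if unset
        remaining = length
        while remaining > 0:
            chunk = min(burst, remaining)
            # The target requests each burst before the initiator sends data.
            frames.append(("target", "FCP_XFER_RDY"))
            frames += [("initiator", "FCP_DATA")] * (-(-chunk // max_payload))
            remaining -= chunk
    else:
        raise ValueError("op must be 'read' or 'write'")
    frames.append(("target", "FCP_RSP"))         # completion of the exchange
    return frames

read_frames = fcp_exchange("read", 8192)
write_frames = fcp_exchange("write", 8192, burst_len=4096)
```

For an 8 KB transfer this yields four FCP_DATA frames in either direction, with two FCP_XFER_RDY frames in the write case because the assumed burst length is 4 KB.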
 
(b) DMA Operation 
The FC controller receives a frame from the incoming frame buffer and 
de-encapsulates the frame (see Figure 4.17). If it is an FCP DATA frame, the data is 
passed directly to the specified memory area. Otherwise, the frame is passed to the 
single frame queue and the host system is interrupted for attention. For an outgoing 
frame, the FC controller gets an operational command from the device driver through 
the register map. When it is an I/O command, the controller retrieves the CDB and 
related information from the host through DMA, constructs an FCP CMD frame and places 
it into the outgoing frame buffer queue. The FC controller may also need to send a 
larger segment of data from the host memory. In this case, the FC controller fetches 
a few frame lengths of data from the host memory and constructs FCP DATA frames and 













































Figure 4.17: DMA Operation 
4.4.3 FC_Communication Module 
The FC_communication module models Fibre Channel frame transmission. Following 
the communication protocol, we first describe the FCAL module structure. Then we 
describe the point-to-point communication module. After that, we introduce the 
fabric switch structure. To put them all together, we present the global addressing 
scheme and the integration with the bus structure for different layers of network. 
 
(a) FCAL Simulation 
To simulate FCAL loop behavior, we designed a loop object that holds all 
participating L_ports and the overall information about the loop, such as the loop’s 
media speed (1G or 2G). An L_port actually resides in an FC controller; however, the 
loop object simplifies the management of the L_ports. The data structure for the 
FCAL is shown in Figure 4.18, and the FCAL operation is shown in Figure 4.19. 
 
Figure 4.18: FCAL Data Structure 
Each L_port receives incoming ordered sets and transmits related ordered sets 
to maintain the loop communication. The Loop Port State Machine for each L_port is 



















Figure 4.19: FCAL Operation 
 
 























































































L_port Sub-module 
 
Figure 4.21: The Edge Event of Continuous Order Sets 
Simulating only the edge events of continuous ordered sets (OS) to save 
computing power is one of the measures implemented in SANSim (see Figure 4.21). 
Port 1 may start transmitting an OS at time t. CTW and CTWTime then record the OS 
type and the time, respectively. After the propagation delay (5 ns/m), the OS reaches 
Port 2, and the port starts receiving the OS. CRW and CRWTime of Port 2 are then 
updated. CTW (CTWTime) and CRW (CRWTime) remain unchanged until the edge of the 
signal is reached. In this way, the simulation only records the changing edges of 
the OS signal, instead of transmitting the same OS every 38 ns, which would generate 
a great number of events and make simulation impossible with limited computing power. 
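A minimal Python sketch of this edge-event bookkeeping is given below. The class Port and the helper names are hypothetical; the 5 ns/m propagation delay and the 38 ns word time are the values quoted in the text, and the cable length is an illustrative input.

```python
PROP_DELAY_NS_PER_M = 5   # propagation delay quoted in the text
WORD_TIME_NS = 38         # time to transmit one ordered set, as quoted

class Port:
    def __init__(self):
        self.ctw, self.ctw_time = "IDLE", 0   # current transmitting word / start time
        self.crw, self.crw_time = "IDLE", 0   # current receiving word / start time

def transmit_edge(tx, rx, os_type, t_ns, cable_m):
    """Record a change of ordered set at the sender and at the receiver."""
    tx.ctw, tx.ctw_time = os_type, t_ns
    arrival = t_ns + PROP_DELAY_NS_PER_M * cable_m
    rx.crw, rx.crw_time = os_type, arrival
    return arrival

def words_covered(t_start_ns, t_end_ns):
    """Number of identical ordered sets that one edge record stands for."""
    return (t_end_ns - t_start_ns) // WORD_TIME_NS

port1, port2 = Port(), Port()
arrival = transmit_edge(port1, port2, "ARB", t_ns=1000, cable_m=10)
saved = words_covered(1000, 4800)
```

With these numbers the ordered set reaches Port 2 at 1050 ns, and a 3.8 microsecond run of identical ordered sets is represented by one record rather than 100 per-word events.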
When an L_port control request arrives at a port, the state of the port may 
change and cause the CFW to change. Some state transitions only latch on the next 
incoming event. In that case, a self-triggering process helps to activate a new incoming 











CRW: current receiving word
CRWTime: time at which receiving of CRW started
CTW: current transmitting word
CTWTime: time at which transmitting of CTW started




To send a frame, an L_port event indicating SOF is issued and the port is 
blocked from sending any other word until all data of the frame has been 
transmitted. EOF is then transmitted and the block is cleared. 
 
(b) Point-to-point communication 
Besides the loop connection, the other communication mode is the point-to-point 
pair. Point-to-point communication is not necessarily the same as the point-to-point 
topology. In terms of topology, point-to-point refers to the FCP connection, in 
which one FCP initiator talks to one target. Our point-to-point communication 
instead addresses the communication between two FC ports. With connectionless 
communication and buffer-to-buffer communication, the fabric connections can be 
treated as multiple pairs of point-to-point connections. Figure 4.22 
























(b) BB Credit Flow Control Between Two FC Ports 
Figure 4.22: BB Credit Flow Control in FC Connection Sub-module 
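The buffer-to-buffer credit mechanism between two FC ports can be illustrated with the sketch below. It follows the general rule of BB credit flow control (one credit consumed per frame sent, one credit returned per R_RDY received); the class BBCreditLink and its fields are hypothetical and do not reproduce the SANSim data structures.

```python
from collections import deque

class BBCreditLink:
    """Sender side of a point-to-point link with buffer-to-buffer credit."""

    def __init__(self, bb_credit):
        self.credit = bb_credit     # receive buffers known to be free at the peer
        self.pending = deque()      # frames waiting for credit
        self.delivered = []         # frames actually put on the link

    def send(self, frame):
        self.pending.append(frame)
        self._pump()

    def r_rdy(self):
        # The peer freed one receive buffer and signalled R_RDY.
        self.credit += 1
        self._pump()

    def _pump(self):
        # Transmit as long as both a frame and a credit are available.
        while self.pending and self.credit > 0:
            self.credit -= 1
            self.delivered.append(self.pending.popleft())

link = BBCreditLink(bb_credit=2)
for f in ("F1", "F2", "F3"):
    link.send(f)
sent_before_rrdy = list(link.delivered)
link.r_rdy()
```

The third frame is held back until the first R_RDY arrives, which is the behavior that limits throughput when the receiver has few incoming frame buffers.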
(c) FC Switch Fabric 
The SANSim FC switch module has two kinds of sub-modules: multiple FC ports and 
one FC switch core (FC_SW), as shown in Figure 4.23(a). The FC ports comprise 
F_ports/FL_ports and E_ports; an F_port/FL_port is used for host-switch and 
device-switch connections, and an E_port is used for the interconnection between 
switches. Each FC_port’s Address_ID is unique and conforms to the FC-SW-2 standard 
[26]. FC_SW is the switch’s control center for frame routing and forwarding. It 
contains the routing function and an internal cross-bar, as shown in Figure 4.23(b). 
If the destination port of a requested FC frame is busy, the incoming frame remains 
in the incoming buffer until it is successfully routed. SANSim uses Dijkstra’s 
algorithm to compute the shortest path for the routing. The routing table remains 
unchanged unless the network connectivity changes during the simulation. When the 
network connectivity changes, the switch module re-computes the shortest path. 
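For reference, a compact Python version of Dijkstra’s algorithm that also derives a next-hop routing table is sketched below. It is a generic textbook formulation rather than the SANSim routine: build_routes, the switch names and the link costs are hypothetical.

```python
import heapq

def build_routes(graph, source):
    """Dijkstra's algorithm; returns (distance, first hop) for each reachable switch."""
    dist = {source: 0}
    first_hop = {}                 # destination -> neighbor of source to forward to
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:         # stale queue entry, a shorter path was found
            continue
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                first_hop[nbr] = nbr if node == source else first_hop[node]
                heapq.heappush(heap, (nd, nbr))
    return dist, first_hop

# Hypothetical four-switch fabric; edge weights are link costs.
fabric = {
    "sw1": {"sw2": 1, "sw4": 1},
    "sw2": {"sw1": 1, "sw3": 1},
    "sw3": {"sw2": 1, "sw4": 3},
    "sw4": {"sw1": 1, "sw3": 3},
}
dist, first_hop = build_routes(fabric, "sw1")
```

Re-running build_routes after editing the fabric dictionary corresponds to recomputing the routing table when the connectivity changes.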









(a) FC_switch Module (b) FC_SW Sub-module 
Figure 4.23: FC_Switch Modeling 
4.5 SANSim Storage Module 
The storage system module supports the following configurations: 
• RAMDisk: This device has a fixed access time for each request. The 
access time can alternatively be specified as a random value drawn from 
a given probability distribution. 
• Hard Disk Drive (HDD): This module models mechanical behavior such as 
seek time and rotational latency, as well as the behavior of the disk 
buffer cache. 
• Disk Array: Multiple disks can be grouped to present a single storage 
device to the upper layer. Currently, it supports RAID 5 and RAID 0 
operations. Degraded-mode operation and rebuild can also be 
supported. 
 
4.5.1 RAID Array Module 
The RAID array is the module used in SANSim to represent a complex storage 
system. The RAID array contains its own I/O controller, interconnections and storage 
devices. A storage device inside the RAID array can itself be a RAID array. In 
Figure 4.24, when each RAID array is configured as a RAID 5 disk array, a 
multi-dimensional RAID 5 is formed. 
  





1st Level RAID 
2nd Level RAID 
 
 




Features of RAID Array module are as follows: 
• Can be run independently 
• RAID can be cascaded  
• Can be integrated with new RAID algorithms 
• Detailed simulation of processing, internal system bus, cache management,  























Figure 4.25: Functional Block Diagram 
 
The RAID array module consists of five major sub-modules, as shown in Figure 
4.25. The general queue/scheduler modules are used. The front bus interfaces behave 
differently depending on how the bus is used. As mentioned in the earlier section, 
the bus here could be a complicated network such as an FC fabric. The I/O requests 
are received through the interfaces, and the requested data are transferred to/from 
the bus interface as well. The Request Processing sub-module acts as a coordinator 
that serves the request. It first passes the request to the Data Cache sub-module to 
check whether the requested data is in the cache. When a miss happens and accesses to 
the disks are required, the Request Processing unit invokes the RAID Algorithm to 
map the request from the virtual address space to the actual block address on each 
disk accessed. Redundancy dependencies, such as read-modify-write dependencies, are 
preserved.  Finally, 



































Figure 4.26: RAID Array Queuing Module  
 
The RAID array controller can be viewed as a queuing system that maintains 
an incoming command queue and several outgoing queues, one dedicated to each of the 
underlying storage devices, as shown in Figure 4.26. The general queue module 
supports all these queues. The I/O requests received are first put in the incoming 
command queue. The process retrieves commands from the command queue following 
certain policies. The incoming command queue allows some optimization of the waiting 
commands, such as combining several sequential I/Os. Additionally, when the 
processing unit is busy, the queue provides a buffer stage for the commands. The 
outgoing queues hold the I/O requests to each of the hard disk drives. For a disk 
drive that supports command queuing, the RAID array controller may know the number 
of commands allowed to be outstanding. The outgoing queue holds the information 
about the outstanding requests. The rest of the requests wait in the outgoing queue 
until outstanding ones are completed. Similarly, some optimization can be done on 
these waiting commands. 
4.5.3 RAID Algorithm Process  
The RAID array model supports various schemes of mapping and redundancy 
combinations.  
 
(a) Address Mapping 
The mapping scheme is the way of translating the address of a request into the 
block address on a member disk. In some situations, such as Just a Bunch of Disks 
(JBOD), accesses are addressed directly to the disk’s block address. That is the 
case of “no mapping”. In the ideal case, the requests are evenly distributed over 
the member disks. For evaluation purposes only, the “ideal” mapping scheme directs 
the external requests to the member disks in round-robin order. The random scheme 
picks a member disk randomly instead. The actual block addresses are ignored in 
these two cases. More often, disk arrays are organized by “striping”: the virtual 
address space is striped across the member disks, i.e. the data are written to (and 
later read back from) multiple disks for better performance. To get optimal 
performance, the fraction of reads or writes going to each member disk should be the 
same. Such an optimal solution is not achievable in practice because of the blocking 
factor. For evaluation purposes, we may still conduct a simulation with 
stripped_unit_size=0, which means fine-grained striping.  
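For the striped case with a non-zero stripe unit, the address translation can be illustrated as follows. This is the conventional RAID 0 style computation, given here as an assumption about what the striped mapping does; stripe_map and its parameters are hypothetical names, and parity placement is ignored.

```python
def stripe_map(lba, stripe_unit, n_disks):
    """Map a virtual block address to (member disk, block address on that disk)."""
    unit_no, offset = divmod(lba, stripe_unit)   # which stripe unit, and where in it
    disk = unit_no % n_disks                     # units rotate over the member disks
    disk_lba = (unit_no // n_disks) * stripe_unit + offset
    return disk, disk_lba

# Stripe unit of 4 blocks over 3 member disks.
mapping = [stripe_map(lba, stripe_unit=4, n_disks=3) for lba in (0, 4, 13)]
```

Consecutive stripe units land on consecutive disks, so a large sequential request is spread over all member disks.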
(b) Redundancy 
The redundancy scheme refers to the method of preserving the data in the 
presence of a disk failure. Without any redundancy, i.e. with the “no redundancy” 
option that is also used for JBOD, the data will be lost. One option is to write the 
same data to multiple disks and later read it from any one of them. This method is 
called replication. Higher fault tolerance requires more replicas. Another option is 
to save only the parity of the data written to the other disks. The parity is 
computed by the XOR operation. In the presence of one disk failure, the unavailable 
data (originally stored on the failed disk) can be recovered from the parity and the 
other available data. The parity can be stored on a fixed disk, or rotated over all 
member disks. The combination of striped mapping and parity redundancy is often 
implemented as a parity table scheme. A table is established as a means of 
reference. Each cell of the table records the device number and the relative block 
number according to the stripe unit size and the parity placement scheme.  When 
mapping is required, the entry number into the table is first computed from the 
virtual logical address. The destination device number and block address are then 
determined from the table entry. 
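The XOR parity computation and the recovery of a lost block can be illustrated with the short sketch below. The helper names are hypothetical, and the rotation rule in parity_disk is only one possible placement, chosen for illustration; the thesis does not specify which rotation SANSim uses.

```python
from functools import reduce

def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_disk(stripe_row, n_disks):
    """Hypothetical rotated placement: the parity moves one disk per stripe row."""
    return (n_disks - 1 - stripe_row) % n_disks

data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)

# Simulate the loss of data[1] and rebuild it from the survivors plus parity.
recovered = xor_parity([data[0], data[2], parity])
placement = [parity_disk(row, n_disks=4) for row in range(5)]
```

Because XOR is its own inverse, the same function serves both for computing the parity and for reconstructing the missing block.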































Figure 4.27: Functional Block Diagram of the RAID Algorithm Process 
 
Figure 4.27 illustrates the functional flow of the RAID algorithm process. The 
original request is validated first, and then a record is created to hold the 
information about the request. The record is labeled with a unique id that is used 
as the key into the record hash table. The request is passed to one mapping 
function, determined by the configuration parameter, to produce one or more requests 
that access the member disk drives. The “parity table” scheme is used in two cases. 
In the first case, when the mapping is “striped” with a non-zero stripe unit size 
and the redundancy scheme is fixed parity disk, the initialization routine changes 
the redundancy type to “parity table” and the mapping scheme to “no redundancy”. A 
table with only one stripe for address mapping is created to serve the purpose. The 
second case is when the “rotated parity” redundancy scheme and striped mapping with 
a non-zero stripe unit are specified. In this case, the initialization routine 
changes the mapping and redundancy in the same way, and the rotational table is 
created. Each redundancy process determines the additional I/O requests required by 
the combination of the mapping scheme and the redundancy scheme. Some of these I/Os 
may need to be performed before the others. A dependency relationship between these 
I/O requests is created and stored with the record.  At the end of the redundancy 
process, all I/O requests are analyzed and several small sequential accesses are 
combined to form a larger I/O. The final list of I/Os is returned to the media 
access scheduler. The attached hard disk drives eventually receive and serve these 
requests. Upon completion, the RAID array receives a completion message. The 
completion message carries the information about the dependent requests, including 
the record id. The redundancy process first uses the id to retrieve the request’s 
record from the hash table, and determines whether any requests are on hold waiting 
for this completion. If so, the waiting requests are returned. Otherwise, the 
completion may indicate the completion of the whole I/O, and the completion routine 
must then be invoked. 
 
4.5.5 Summary of Supporting Schemes  
The combinations of mapping and redundancy schemes currently supported are 
shown in Table 2.  





Redundancy \ Mapping   No Mapping   Ideal   Random   Striped (unit size = 0)   Striped (unit size > 0) 
No Redundancy          √            √       √        √                         √ 
Replications           √            √       √        √                         √ 
Fixed Parity Disk      √            √       √        √                         √ 
Rotated Parity         √            √       √        √                         √ 
 
4.6 Capabilities and Limitations of SANSim 
SANSim has the following capabilities: 
• Supports SAN simulation and design 
• Supports FC frame level simulation 
• Supports storage subsystem simulation 
• Supports SAN algorithm and architecture research  
SANSim also has the following limitations, which we will address in future 
work: 
• Does not support a GUI 
• Does not support other storage protocols (e.g. iSCSI, iFCP) 




This chapter gives an overview of SANSim development and its four main 
modules: I/O workload module, Host module, Network module and Storage module. 
We also discuss some implementation issues, e.g. memory allocation, data structures. 







Chapter 5   Experiments and Validation 
Sophisticated FC SAN simulation requires accurate and detailed FC 
transmission specifications, including information on command processing and 
protocol overheads etc. The accuracy of the extracted information is demonstrated by 
comparing the behavior of SANSim simulation against actual SAN activity. 
5.1 FC SAN Transmission Specifications 
We use a Finisar GTX Fibre Channel Analyzer to track the actual FC 
communication in the SAN (see Figure 5.1). Qlogic 2300 FC HBAs are used in both the 
initiator and the target. The FC analyzer monitors the transmission of FC frames.  
 
Figure 5.1: Real System Overhead Collection Configuration 
Figure 5.2 shows an example trace extracted from the FC analyzer. The FC 
analyzer can capture 100% of the traffic on a 2 Gb/s FC link (full duplex), with 
time intervals accurate to nanoseconds. Each event in the trace starts with a 
timestamp (in nanoseconds) that indicates the beginning time of the transmission. 
The next field records either a run of identical Ordered Sets or a transmitted 
frame. If the transmission is an Ordered Set, the OS type with its parameters, such 
as OPN (x,y), and the OS count are presented. If it is a frame, the FCP type and 
frame size as well as other frame header information are shown.   
 
 




Figure 5.3:  Example Trace of Simulated Port Transmission 
 
In order to calibrate the FC transmission, the SANSim simulation modules can 
print all event information in a readable format.  Figure 5.3 shows an example of a 
SANSim FC frame transmission printout, which follows the same sequence as the actual 
trace shown in Figure 5.2. By examining these events, the correctness of the 
protocol implemented in the simulation models is validated. For example, the 
requirement of a six-ordered-set gap between frames is fulfilled; signals such as 
R_RDY, OPN and CLS are sent only once; each R_RDY is preceded and followed by at 
least two fill words; the LPSM state transitions are verified; the alternate BB 
credit flow control logic is tested; and the FCP transaction protocol is followed. 
5.2 FC SAN Overhead 
Based on the analysis of real FC analyzer traces obtained during model
validation, we set the HBA overhead and control constants as shown in Table 3 and
Table 4 for the initiator and the target, respectively. Note that the initiator HBA has
a command execution overhead of 43.9 microseconds, which theoretically limits its I/O
processing capacity to about 22K IOPS. Following most industry implementations, the
login guaranteed buffer credit is set to zero. The DMA transfer bandwidth between the
HBA and the rest of the system is set to 1064 MB/s, corresponding to the optimal speed
of a 133 MHz, 64-bit bus. The overhead of the DMA scheduler, however, is set to about
15 microseconds. The DMA scheduling policy is set to FCFS.
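The two derived figures in this paragraph can be reproduced with a short calculation. This is a sketch for checking the arithmetic; the function and constant names are illustrative, and only the numeric constants come from the text.

```python
# Back-of-the-envelope check of the HBA constants quoted in the text.
# Names are illustrative; only the numbers come from the thesis.

CMD_OVERHEAD_US = 43.9   # initiator command execution overhead (microseconds)
BUS_CLOCK_HZ = 133e6     # 133 MHz bus clock
BUS_WIDTH_BITS = 64      # 64-bit bus


def iops_ceiling(overhead_us: float) -> float:
    """Upper bound on I/Os per second if each command costs overhead_us."""
    return 1e6 / overhead_us


def bus_bandwidth_mb_s(clock_hz: float, width_bits: int) -> float:
    """Peak bus bandwidth in MB/s, taking 1 MB as 10**6 bytes."""
    return clock_hz * (width_bits / 8) / 1e6


if __name__ == "__main__":
    print(f"IOPS ceiling:  {iops_ceiling(CMD_OVERHEAD_US):.0f}")
    print(f"Bus bandwidth: {bus_bandwidth_mb_s(BUS_CLOCK_HZ, BUS_WIDTH_BITS):.0f} MB/s")
```

The ceiling comes out slightly above 22K IOPS, and the bus bandwidth is 133 MHz times 8 bytes, which matches the DMA bandwidth listed in Table 3 and Table 4.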
 
Table 3: System Initiator HBA Overhead & Control Constant
Parameter                             Value
Incoming Frame Buffer                 3
Login Guaranteed Buffer               0
Maximum Frame Size                    2048 bytes
Full Duplex                           No
Command Execution Overhead            43.9 microseconds
FCP_XFER_RDY Handling Overhead        11.8 microseconds
DMA Bandwidth                         1064 MB/s
DMA Scheduler Overhead                15.5 microseconds
DMA Round Robin                       No
Incoming Frame Processing Overhead    400 nanoseconds




Table 4: FCP Target Overhead & Control Constant
Parameter                             Value
Incoming Frame Buffer                 2
Login Guaranteed Buffer               0
Maximum Frame Size                    2048 bytes
Full Duplex                           No
Command Execution Overhead            47.4 microseconds
FCP_RSP Generation Overhead           10.3 microseconds
DMA Bandwidth                         1064 MB/s
DMA Scheduler Overhead                15.5 microseconds
DMA Round Robin                       No
Incoming Frame Processing Overhead    400 nanoseconds
Single Frame Transfer Overhead        20.2 microseconds
 
5.3 Experimental Environment 
The experiments are conducted on an FC-AL configuration with one Windows
initiator and one storage target (see Figure 5.4). Qlogic 2300 FC HBAs are used in both
the initiator and the target. Microsoft Windows XP Professional SP1 is installed on
the initiator system together with the HBA initiator device driver from Qlogic. The
target driver was developed by the Data Storage Institute, Singapore (DSI) [38, 39, 40,
41]. The target software maps all storage I/O to memory rather than to an actual
magnetic disk. Since we focus on the FC-AL connection, using a RAM disk as the target
helps to isolate the results from problems in modeling an actual hard disk drive.
Table 5 lists the detailed hardware and software configurations used in the
experiments. IOMeter [52], widely used as an industry-standard benchmark tool, is used
here to collect I/O performance data.






Figure 5.4: Simulation Configuration 
 
Table 5: System Configuration for SANSim FC-AL Module Validation
Initiator
  Hardware   CPU: AMD Athlon MP 1600+
             FC HBA: Qlogic 2300
             RAM: 2 x 256MB DDR SDRAM
             Mainboard: 64-bit PCI, Tyan Tiger MP2466N
  Software   OS: Windows XP Professional SP1
             Driver: Qlogic Driver Version 8.1.5.12
             Tool: Intel IOMeter Version 2003.02.15
Target
  Hardware   CPU: Intel PIII 1GHz
             FC HBA: Qlogic 2300
             RAM: 4 x 1GB Kingston ECC Reg. PC133
             Mainboard: 64-bit PCI, Supermicro 370
  Software   OS: RedHat 8.0
             Kernel: 2.4.18
             Driver: In-house 2300 target driver Ver 1.0,
                     In-house Linux RAM Disk Ver 2.0
 
5.4 Performance Metrics and Parameters 
The performance metrics and parameters include IOPS (I/Os per second),
throughput, queue depth, I/O request size, and read/write operations.
• IOPS: the number of I/O requests the system can handle per second. It is
normally quoted for small I/O requests.
• Throughput: the data rate delivered by the storage system (MB/s).
• Queue depth: the number of outstanding I/O requests injected into the FC
network and storage system.
• I/O request size: the size of the data transferred by each I/O operation.
• Read/write operations: whether the I/O requests are reads or writes.
IOMeter initially issues a number of requests equal to the queue depth, and
generates each subsequent I/O request strictly upon the completion of a previous
request. A fixed I/O size is used for all requests.
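The closed-loop behavior described for IOMeter can be sketched with a small simulation. This is an illustrative model, not IOMeter's or SANSim's implementation: it assumes every request first experiences a fixed, contention-free latency and is then served by a single FIFO bottleneck with a fixed service time. Both parameters and all names are hypothetical.

```python
import heapq


def closed_loop_iops(queue_depth: int, latency_s: float, service_s: float,
                     n_ios: int = 10000) -> float:
    """Simulate a closed-loop load generator and return completed I/Os per second.

    Assumed model: fixed latency_s without contention, then one FIFO
    server with fixed service_s per request.
    """
    # Each heap entry is the time at which a request reaches the bottleneck.
    arrivals = [latency_s] * queue_depth  # initial burst of queue_depth requests
    heapq.heapify(arrivals)
    server_free_at = 0.0
    now = 0.0
    completed = 0
    while completed < n_ios:
        arrive = heapq.heappop(arrivals)
        start = max(arrive, server_free_at)  # wait if the server is busy
        now = start + service_s
        server_free_at = now
        completed += 1
        # A completion triggers exactly one new request (closed loop).
        heapq.heappush(arrivals, now + latency_s)
    return completed / now
```

With a small queue depth the generator is limited by the round-trip time of each request. Once the queue depth is large enough to keep the bottleneck busy, the IOPS stops growing, which is the saturation behavior discussed in the comparison section.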



















Figure 5.5: SANSim Simulation Model Configuration 
 
The SANSim simulation model configuration is shown in Figure 5.5. In the real
system, the I/O workload is generated by the IOMeter software. The host model
represents the initiator system, the network model is the FC-AL configuration, and the
storage module is the RAM disk module.
5.6 Comparisons of the Testing Data and Simulation Data 
Figure 5.6(a) and (b) show how the I/O transaction performance varies with the
queue depth for read and write operations. The I/O sizes are set to 2KB, 8KB, 16KB,
and 32KB. The IOPS increases with the queue depth until it reaches the saturation
limit. In other words, IOMeter measures the maximum performance of the network and
storage system once the queue depth exceeds this limit. For example, when the I/O size
is 8KB and IOMeter sends 100% read operations, the maximum transaction
performance is 17.5K IOPS. This means that the per-I/O overhead at the network
bottleneck is around 0.057ms (1 second/17.5K). Further analysis shows that the average
response time is about 0.471ms when the queue depth is eight. In other words, most of
the response time is spent queuing at the network bottleneck.
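The two figures quoted above can be cross-checked with Little's law (outstanding requests = throughput x response time). The thesis does not invoke this law explicitly, so the following is an independent sanity check with illustrative function names.

```python
def bottleneck_service_time_ms(max_iops: float) -> float:
    """Per-I/O time at the bottleneck implied by the saturated IOPS."""
    return 1000.0 / max_iops


def littles_law_response_ms(queue_depth: int, iops: float) -> float:
    """Average response time implied by Little's law, N = X * R."""
    return 1000.0 * queue_depth / iops


if __name__ == "__main__":
    service = bottleneck_service_time_ms(17500)     # 8KB reads, saturated
    response = littles_law_response_ms(8, 17500)    # queue depth of eight
    print(f"service time  ~ {service:.3f} ms")
    print(f"response time ~ {response:.3f} ms")
    print(f"queuing share ~ {1 - service / response:.0%}")
```

Little's law gives a response time of about 0.457ms at a queue depth of eight, close to the 0.471ms reported, and only about one eighth of that time is service at the bottleneck.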
Figure 5.6(c) to (f) show how the IOPS and throughput (MB/s) vary with the I/O
request size for read and write operations. The queue depths are set to 1, 2, and 8.
Generally, the IOPS decreases as the I/O request size increases, while the
throughput increases. When the queue depth is 8, the IOPS remains constant over very
small I/O request sizes (from 1KB to 4KB). This is because the I/O command
processing time dominates the data transfer time for small I/Os. As the request size
increases, the data transfer time also increases. When the request size is large enough
(more than 128KB with a queue depth of 8), the data transfer time dominates the overall
overhead, and the throughput is limited by the FC network bandwidth.
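The two regimes described above (command-limited for small requests, bandwidth-limited for large ones) can be illustrated with a simple bottleneck model. This is a sketch and not the SANSim model: it assumes the achievable rate is the minimum of a command-processing limit and a link-bandwidth limit. The default parameters (43.9 microseconds per command, taken from Table 3, and a 200 MB/s link) are assumptions chosen for illustration, so the crossover point will not match the measured curves exactly.

```python
def max_iops(request_bytes: int, cmd_overhead_us: float = 43.9,
             link_mb_s: float = 200.0) -> float:
    """Illustrative bound: IOPS is capped by command processing for small
    requests and by link bandwidth for large ones."""
    cmd_limit = 1e6 / cmd_overhead_us              # commands per second
    link_limit = link_mb_s * 1e6 / request_bytes   # requests per second the link can carry
    return min(cmd_limit, link_limit)


def throughput_mb_s(request_bytes: int, **params: float) -> float:
    """Throughput in MB/s (10**6 bytes) implied by max_iops."""
    return max_iops(request_bytes, **params) * request_bytes / 1e6


if __name__ == "__main__":
    for kb in (1, 2, 4, 8, 16, 32, 64, 128, 256):
        size = kb * 1024
        print(f"{kb:4d}KB  {max_iops(size):9.0f} IOPS  {throughput_mb_s(size):6.1f} MB/s")
```

In this model the IOPS is flat for small requests and the throughput flattens at the link bandwidth for large requests, which is the qualitative shape described for Figure 5.6.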



































(a) Validation of read transaction performance for fixed I/O sizes as queue depth
increases


































(b) Validation of write transaction performance for fixed I/O sizes as queue depth
increases
































(c) Validation of read transaction performance on fixed queue depths when request 
size increases      
 
Figure 5.6: IOPS and Throughput for R/W Operations with I/O Request Size 
 

































(d) Validation of write transaction performance on fixed queue depths when request
size increases































(e) Validation of read throughput performance on fixed queue depths when request 
size increases 































(f) Validation of write throughput performance on fixed queue depths when request 
size increases 
 
Figure 5.6: IOPS and Throughput for R/W Operations with I/O Request Size (cont.) 
The simulation results and the experimental data shown in the figures match
well in all cases. The error is mostly less than 10%, and for read operations it is
less than 3%. The cache algorithm used in the Qlogic FC HBA is complicated and is
optimized for its hardware. In SANSim, we use the LFU cache algorithm, which is
widely used in modern operating systems. The cache effect is more pronounced for
write operations than for read operations.
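For readers unfamiliar with LFU (least frequently used) replacement, the idea can be sketched as follows. This is a generic textbook-style illustration with hypothetical names, not the SANSim cache module, and its tie-breaking rule (evict the oldest entry among the least used) is an assumption.

```python
from collections import defaultdict


class LFUCache:
    """Minimal LFU cache: on overflow, evict the entry with the fewest accesses."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.values = {}                 # key -> cached value
        self.counts = defaultdict(int)   # key -> access count

    def get(self, key):
        if key not in self.values:
            return None                  # cache miss
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value) -> None:
        if self.capacity <= 0:
            return
        if key not in self.values and len(self.values) >= self.capacity:
            # Ties are broken by insertion order (oldest first).
            victim = min(self.values, key=lambda k: self.counts[k])
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts[key] += 1
```

This version scans all entries on eviction, which is adequate for a sketch. A production cache would normally keep frequency buckets to make eviction constant time.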
5.7 Summary 
The simulation model has been calibrated and validated from three different
aspects. At the FC signal transmission level, the model has been verified by checking
signal transmission events against the actual FC analyzer traces. In terms of general
I/O performance trends, the simulation model outcome agrees well with expectations.
We have demonstrated its accuracy by comparing simulated results with actual
experimental measurements.








Chapter 6   Application of SANSim 
The Core/Edge fabric is the most frequently deployed topology in current FC
SANs because of its scalability. A network based on the Core/Edge topology is
symmetrical: every device has an equivalent path to the other devices connected to the
FC network. One problem of the Core/Edge topology is that when one of the inter-switch
links (ISLs) fails, the network becomes asymmetrical. SAN designers need to know how
the link failure affects network performance in terms of sustainable throughput and
network latency. The effect of the failure location on network availability and
performance is another issue, especially when the network is large and the
probability of link failure is high [21, 42, 43, 44, 45]. In this chapter, we use
SANSim to study the performance degradation of an FC SAN caused by ISL failures.






Figure 6.1: Core/Edge Fabric 
The Core/Edge fabric, shown in Figure 6.1, is the most frequently deployed
topology in current FC SANs. It is derived from the star topology, which is common in
traditional data networks. In a Core/Edge topology, each network device is
connected to a common core network, known as the backbone.
A Core/Edge topology is scalable from many perspectives. Switches of different
sizes can be used in the core and at the edges. The larger the core switch is, the
larger the fabric can grow. Because the Core/Edge topology is symmetrical, every
device has an equivalent path, through the core switches, to any other device
connected to the other switches.
The problem of the Core/Edge topology is that when one of the inter-switch
links (ISLs) fails, the network becomes asymmetrical. The performance of the whole
network is affected by the failed link because of head-of-line blocking.
6.2 Simulation Environment 
As shown in Figure 6.2, the simulation experiments are conducted on a
Core/Edge network with five switches. FC switches 1, 2, 3, and 4 are each connected to
the core switch 5 through two inter-switch links. Eight servers are attached to
switches 1 and 2 (four each), and eight storage devices are attached to switches 3 and
4. Each server randomly accesses the storage devices with a synthetic I/O workload. To
focus on network effects, we use a very simple storage model to handle I/O requests.
The overheads of the FC controllers on the servers and of the interfaces to the storage
devices are also set to very small values.
Using 2G FC, four cases are studied. In the first case (normal mode), there is
no ISL failure. The maximum aggregate nominal bandwidth for the servers attached to
SW1 or SW2 is 400MB/s. The second case is a single link failure between the server
switch SW1 and the core switch SW5 (i.e., ISL1 fails, as shown in Figure 6.2). In this
case, Servers 1-4 can sustain only 200MB/s of throughput, while Servers 5-8 may still
achieve the full bandwidth of 400MB/s. In the third case, the single failure is located
on ISL8 in Figure 6.2, and the maximum bandwidth to/from storage devices 1-4 is
200MB/s. In the fourth case, both ISL1 and ISL8 fail.
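The nominal bandwidths for the four cases follow directly from the number of surviving ISLs per edge switch. The sketch below assumes 200MB/s per 2G FC link, as used in the text. Apart from ISL1 (on SW1) and ISL8 (on SW4), the assignment of ISL numbers to switches is an assumption made for illustration.

```python
LINK_MB_S = 200  # nominal bandwidth of one 2G FC link, as used in the text

# ISLs attached to each edge switch. Only ISL1 -> SW1 and ISL8 -> SW4 are
# stated in the text; the rest of this numbering is an assumption.
ISLS = {"SW1": {1, 2}, "SW2": {3, 4}, "SW3": {5, 6}, "SW4": {7, 8}}


def nominal_bandwidth(failed=frozenset()):
    """Nominal bandwidth (MB/s) between each edge switch and the core."""
    return {sw: LINK_MB_S * len(links - set(failed)) for sw, links in ISLS.items()}


if __name__ == "__main__":
    cases = {"I": set(), "II": {1}, "III": {8}, "IV": {1, 8}}
    for name, failed in cases.items():
        print(f"Case {name}: {nominal_bandwidth(failed)}")
```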
 
Figure 6.2: Simulation Configuration 
6.3 Simulation Results and Analysis  
6.3.1 Throughput Analysis 
Figure 6.3 shows the measured throughput of each server for the four cases as
the I/O load increases. The I/O workloads, which are applied equally to the 8 servers,
use a fixed request size. The I/O load, measured in MB/s, is equal to the request size
times the I/O rate. I/O sizes of 2KB, 8KB, and 32KB are selected in the simulation. The
I/O requests issued by each server are distributed equally and randomly across all 8
storage devices.
An I/O load that exceeds the network capacity can be applied to the servers in
order to measure the maximum network throughput in the simulation. An I/O scheduler
on the server allows at most a fixed number of requests (32 in this simulation) to be
outstanding. When the number of outstanding I/Os equals this maximum, a newly
arrived I/O request is put in the server front-end queue. We assume that the depth of
the front-end queue is unlimited. Upon completion of one outstanding I/O request, the
scheduler takes one request from the front-end queue and sends it out for I/O
processing.
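The scheduler behavior described in this paragraph can be sketched as a small class. The class and method names are hypothetical and this is not the SANSim source. It only illustrates the rule of a bounded number of outstanding requests backed by an unbounded front-end queue.

```python
from collections import deque


class IOScheduler:
    """Limits outstanding I/Os and parks the excess in a FIFO front-end queue."""

    def __init__(self, max_outstanding: int = 32) -> None:
        self.max_outstanding = max_outstanding
        self.outstanding = 0
        self.front_end = deque()  # assumed unbounded, as in the text

    def submit(self, request):
        """Return the request if it can be issued now, otherwise queue it
        and return None."""
        if self.outstanding < self.max_outstanding:
            self.outstanding += 1
            return request
        self.front_end.append(request)
        return None

    def complete(self):
        """Handle one completion; return the next request to issue, if any."""
        self.outstanding -= 1
        if self.front_end:
            self.outstanding += 1
            return self.front_end.popleft()
        return None
```

Because the front-end queue absorbs any excess load, the network never sees more than the configured number of outstanding requests per server, however high the offered load.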































(a) Case I: No link failure 





















(b) Case II: ISL1 failure
Figure 6.3: Maximal Throughputs under Symmetrical I/O Load for Different 
Cases 





















(c) Case III: ISL8 failure






















(d) Case IV: Both ISL1 and ISL8 failure
 
Figure 6.3: Maximal Throughputs under Symmetrical I/O Load for Different 
Cases (cont.) 
Figure 6.3(a) plots the throughput measured at each server for the 2KB, 8KB,
and 32KB workloads in the normal case. The throughput grows linearly as the I/O
load increases, until it reaches a maximum value of around 80MB/s. Each server
achieves almost the same throughput. Note that the maximum throughput (80MB/s x 4
= 320MB/s for all servers connected to a single switch) is 20% less than the nominal
value (400MB/s). Note also that the throughput for the 32KB I/O size (82MB/s) is
slightly larger than that for 2KB (78MB/s). The performance loss is caused by the
inherent protocol overhead and head-of-line blocking.
Figure 6.3(b) shows how the measured throughput varies with the I/O load when
ISL1 fails. The maximum throughput of Servers 1-4 declines to 45MB/s. With a single
link between SW1 and SW5, the nominal bandwidth is 200MB/s. The measured value
(45MB/s x 4 = 180MB/s for Servers 1-4) is about 10% less than the nominal value. The
percentage of throughput lost relative to the nominal value is smaller than in the
normal case (10% vs. 20%). This is mainly because of the different bandwidth
utilization in the two cases. At an I/O load of 45MB/s, the bandwidth utilization
between the storage switches SW3 and SW4 and the core switch SW5 is lower than in
the normal case with an I/O load of 80MB/s.
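The loss percentages quoted for the normal case and for the ISL1 failure case follow from simple arithmetic, sketched below with an illustrative function name.

```python
def loss_vs_nominal(per_server_mb_s: float, servers: int, nominal_mb_s: float):
    """Return (aggregate throughput in MB/s, fractional loss relative to nominal)."""
    aggregate = per_server_mb_s * servers
    return aggregate, 1.0 - aggregate / nominal_mb_s


if __name__ == "__main__":
    # Case I: four servers at about 80MB/s behind two ISLs (400MB/s nominal).
    print(loss_vs_nominal(80, 4, 400))
    # Case II: Servers 1-4 at about 45MB/s behind one ISL (200MB/s nominal).
    print(loss_vs_nominal(45, 4, 200))
```

The first call gives 320MB/s and a loss of about 20%; the second gives 180MB/s and a loss of about 10%, matching the figures in the text.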
When the I/O load exceeds 45MB/s, the throughput of Servers 1-4 remains at
its maximum value while the throughput of Servers 5-8 continues to increase. The
maximum value for Servers 5-8 depends on the I/O request size. When the I/O size is
2KB, the maximum throughput of Servers 5-8 is about 60MB/s, which is 40% less than
the nominal value. When the I/O size is 32KB, the maximum throughput of Servers 5-8
is only slightly more than 45MB/s, which is far less than the nominal value. The
bandwidth utilization of the link between SW2 and SW5 is only 45%.
When ISL8 fails, the maximum throughput of all servers drops to about 50MB/s
or less, as shown in Figure 6.3(c). The nominal throughput from the storage devices
attached to SW4 is 200MB/s, so each server obtains at most 25MB/s (200/8) from
storage devices 1-4. Under a symmetrical I/O workload, the same I/O load goes to
storage devices 5-8. When the I/O size is 32KB, the measured throughput is around
50MB/s, which is almost the same as the nominal value. Servers 1-4 and Servers 5-8
have the same maximum throughput because the link failure is located between a
storage switch and the core switch: the data traffic from storage devices 1-4 to
Servers 1-4 and to Servers 5-8 is equally affected by the link failure.
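The 50MB/s nominal value for this case can be derived as follows. This is a sketch with illustrative names; it assumes, as stated in the text, that the load is split evenly across the servers and across the two groups of storage devices.

```python
def symmetric_cap_per_server(bottleneck_mb_s: float, n_servers: int,
                             n_device_groups: int) -> float:
    """Per-server throughput cap when one device group sits behind a
    bottleneck and the workload is symmetrical across all groups."""
    per_group = bottleneck_mb_s / n_servers   # share of the bottlenecked group
    return per_group * n_device_groups        # the same load goes to every group


if __name__ == "__main__":
    # One surviving 200MB/s ISL to SW4, 8 servers, 2 groups of storage devices.
    print(symmetric_cap_per_server(200, 8, 2))
```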
When both ISL1 and ISL8 fail, the measured server throughputs are as shown
in Figure 6.3(d). The throughputs of all servers grow as the I/O load increases up to
40MB/s. When the I/O load exceeds 40MB/s, the throughputs of Servers 5-8 continue
to grow while the throughputs of Servers 1-4 drop. The larger the I/O request size,
the smaller the change in throughput. When the I/O request size is 32KB, the
throughput of Servers 1-4 remains almost constant and the throughput of Servers 5-8
increases slightly.
When the I/O size is 2KB, the maximum throughput of Servers 1-4 varies
with the I/O load, as shown in Figure 6.4. The maximum throughput of Servers 5-8,
however, is around 58MB/s, which is much higher than the 46MB/s in Figure 6.3(c). To
analyze the network activity in detail, the data traffic across ISL1-4 and ISL5-8 is
monitored, as shown in Figure 6.5 and Figure 6.6, and the average response time of
I/Os from each server to each storage device is measured and plotted in Figure 6.7.
Servers 1-4 achieve their maximum throughput at an I/O load of 35MB/s. As shown in
Figure 6.7, the I/O response time from Servers 1-4 to storage devices 1-8 rises
sharply at an I/O load of 35MB/s. When the I/O load increases further, the response
time of I/O requests issued by Servers 5-8 to storage devices 1-4 increases notably
compared with that at lower I/O loads. This is caused by the queuing time at storage
devices 1-4. As this queuing time grows, the completion time of an I/O from Servers
1-4 to storage devices 1-4 increases accordingly, and thus the throughput of Servers
1-4 decreases. When the throughput of Servers 5-8 reaches its maximum value of about
58MB/s, the throughput of Servers 1-4 stops dropping. The servers attached to both
SW1 and SW2 then remain at a steady throughput as the I/O load increases further.
SANSim uses a closed system model. Therefore, as long as the storage system is
not overloaded, the throughput is the same whenever the load is the same (see
Figure 6.3). The throughput thus does not depend on the request size, whereas the
response time does depend on the I/O size.
































































Figure 6.4: Maximum Bandwidth Sustainable for Symmetrical I/O    
 
  































































Figure 6.5: Traffic Delivered to Servers through ISL1-ISL4 
 
 































































Figure 6.6: Traffic Delivered from Storage Devices through ISL5-ISL8















































Figure 6.7: I/O Response Time Measured from Servers for Different Device Accesses
 
6.3.2 Network Latency Analysis 
To study the network latency, we conduct a series of simulations with
different I/O workloads. The data frame latency is measured from the time when the
storage device puts a data frame into its frame buffer to the time when the server
receives that data frame. Figure 6.8 shows the frame latency for the four cases when
the I/O workload is 50MB/s. Each line in the figures represents the average frame
latency when server X accesses storage device Y. In case I, the frame latencies from
any server to any storage device are very consistent, as shown in Figure 6.8(a). The
average of all the curves is taken as the reference line for the other cases.
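The per-pair averaging described above can be sketched as follows. The record layout (server, storage device, send time, receive time) and the function name are hypothetical and do not reflect SANSim's internal data structures.

```python
from collections import defaultdict


def mean_frame_latency(records):
    """Average data frame latency for each (server, storage) pair.

    records: iterable of (server, storage, sent_s, received_s), where
    sent_s is when the storage device put the frame into its frame buffer
    and received_s is when the server received it (hypothetical layout).
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for server, storage, sent_s, received_s in records:
        total[(server, storage)] += received_s - sent_s
        count[(server, storage)] += 1
    return {pair: total[pair] / count[pair] for pair in total}
```

Each curve in Figure 6.8 would then correspond to one key of the returned dictionary, evaluated at each I/O size.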
In case II, the frame latencies for Servers 1-4 and Servers 5-8 are clearly
separated, shown as A and B respectively in Figure 6.8(b). Both A and B are much
larger than the reference latency. For small I/Os, the latency for Servers 1-4 is
almost double that for Servers 5-8. The deviation of the latency becomes much larger
for large I/Os.
With an ISL8 failure (case III), the latencies for all servers accessing storage
devices 1-4 and 5-8 are separated, as shown in Figure 6.8(c). The latency for
accessing storage devices 1-4 is almost four times that for accessing storage
devices 5-8. When both ISL1 and ISL8 fail, the separation effects of the two failures
combine and the deviation of the frame latency is magnified, as shown in Figure
6.8(d). The frame latencies differ for the four kinds of access, shown as A, B, C, and
D respectively in the figure.
For the low I/O workload cases, we also collected frame latency data at I/O
workloads of 10MB/s and 30MB/s. The results show that the impact of link failure on
the frame latency is very limited for a light workload such as 10MB/s. The worst-case
network latency is 0.105ms (32KB I/O size) for Servers 1-4 accessing storage devices
1-4 when both ISL1 and ISL8 fail, compared with 0.088ms in the normal case.
When the I/O workload is 30MB/s, the effect of link failure on the network
latency becomes obvious. The network latency increases by 80% (32KB I/O size) for
Servers 1-4 accessing storage devices 1-4 when both ISL1 and ISL8 fail.
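To put the 10MB/s figures on the same footing as the 80% increase quoted for 30MB/s, the relative increase can be computed from the two latencies given above. The function name is illustrative.

```python
def relative_increase(baseline_ms: float, degraded_ms: float) -> float:
    """Fractional latency increase of the degraded case over the baseline."""
    return (degraded_ms - baseline_ms) / baseline_ms


if __name__ == "__main__":
    # 10MB/s load, 32KB I/O: 0.088ms normal vs 0.105ms with both ISLs failed.
    print(f"{relative_increase(0.088, 0.105):.1%}")
```

At 10MB/s the worst case is therefore an increase of roughly 19%, compared with 80% at 30MB/s.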
 
(a) Case I: No link failure
(b) Case II: ISL1 failure (legend: A: Servers 1-4 to each device)
(c) Case III: ISL8 failure (legend: A: each server (1-8) to devices 1-4; B: each
server (1-8) to devices 5-8; REF LINE)
(d) Case IV: Both ISL1 and ISL8 failure (legend: A: Servers 1-4 to devices 1-4;
B: Servers 5-8 to devices 1-4; C: Servers 1-4 to devices 5-8)

Figure 6.8: Data Frame Latency for 50MB/s I/O Load
6.4 Summary 
In this SANSim application, a novel way of simulating an FC SAN is presented.
The simulation tool SANSim works at the FC frame level, and includes all idle,
command, and data frames. The FC module of SANSim is modular and scalable. Such
a tool is crucial to the rapid development of high-performance SANs, given the
increasing complexity of SAN architectures.
The simulation results show that the Core/Edge fabric topology suffers from a
certain level of bandwidth loss due to the head-of-line (HOL) blocking caused by
traffic crossing multi-stage switches. When an inter-switch link fails, the throughput
of all servers decreases under high I/O load. With a redundant link design, the
throughput of all servers remains constant under light I/O load when some links fail,
but the frame latency and the I/O response time for all servers become worse because
of HOL blocking. Servers at different locations experience different I/O performance.







Chapter 7   Conclusion and Future Work 
In this thesis, a platform for FC SAN simulation and design is presented.
SANSim, which works at the FC frame level, can simulate all primitive signals (IDLEs,
RDYs, etc.), commands, and data frames. The design of SANSim is modular and
scalable. Such a tool is useful for the rapid development of high-end SANs, given the
increasing complexity of SAN architectures. SANSim has been validated against
experimental results from a real FC network. Through simulation, it is found that the
Core/Edge topology suffers from a certain level of bandwidth loss due to the
head-of-line blocking caused by traffic crossing multi-stage switches.
7.1 Conclusion 
7.1.1 SANSim Simulation Tool 
Storage systems researchers need a SAN simulation tool to develop and verify
new ideas and algorithms for FC storage area networks. Currently there are no
simulation tools that can simulate the whole SAN environment. Most simulation
studies and tools focus only on functional simulation, and are very limited in
modeling and simulation at the FC protocol level.
SANSim, a new FC SAN simulation and design platform, is developed to aid
the development and verification of new ideas and algorithms for FC storage area
networks. It supports FC frame level simulation, which fully simulates the Fibre
Channel protocols.
7.1.2 Simulation Methodology 
 
SANSim uses the event-driven method to develop the simulation model. The
simulation model is designed through a bottom-up process, following the virtual
device concept. This modular design increases the portability of the modules. Some
modules are implemented as global supporting sub-modules so that higher-level
modules can use them. SANSim has a high-resolution timestamp mechanism. Open and
closed system models and the queue modules in the storage system are analyzed.
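The event-driven method with high-resolution timestamps can be illustrated by a minimal simulation kernel. This is a generic sketch of the technique with hypothetical names, not the SANSim engine. It assumes integer nanosecond timestamps and a priority queue of pending events.

```python
import heapq
import itertools


class EventSimulator:
    """Minimal discrete-event kernel with integer nanosecond timestamps."""

    def __init__(self) -> None:
        self.now = 0                      # current simulated time in nanoseconds
        self._queue = []                  # heap of (time, seq, handler, args)
        self._seq = itertools.count()     # tie-breaker: FIFO order for equal times

    def schedule(self, delay_ns: int, handler, *args) -> None:
        """Schedule handler(*args) to run delay_ns after the current time."""
        heapq.heappush(self._queue,
                       (self.now + delay_ns, next(self._seq), handler, args))

    def run(self) -> None:
        """Process events in timestamp order until none remain."""
        while self._queue:
            self.now, _, handler, args = heapq.heappop(self._queue)
            handler(*args)
```

A device module in such a design would be a set of handlers that schedule further events, for example a port handler that schedules the arrival of a frame at the next port after the transmission delay.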
7.1.3 Experiments and Validation 
The SANSim simulation model has been calibrated and validated from three
different aspects. At the FC signal transmission level, the model has been verified by
checking signal transmission events against the actual FC analyzer traces. In terms of
general I/O performance trends, the simulation model outcome agrees well with
expectations. We have demonstrated its accuracy by comparing simulated results with
actual experimental measurements. The results show that the SANSim model is accurate
to within 3% for read operations and within 10% for write operations.
7.1.4 SANSim Application 
SANSim can be used to design SAN architectures. It can also be used to develop
and verify new ideas and algorithms for FC storage area networks, and to uncover
hidden protocol design problems in existing systems. In this thesis, the performance
and availability of a Core/Edge FC network have been analyzed. The simulation results
show that the Core/Edge topology suffers from a certain level of bandwidth loss due to
the head-of-line blocking caused by traffic crossing multi-stage switches.
7.2 Future Work 
Future research work may focus on the simulation of the Object-based Storage
Device architecture. The main future development work on SANSim includes:
• IP storage modules, such as iFCP and iSCSI.
• An Object-based Storage Device module.
• SAN file system: enable multiple hosts to access the same LUN.
• Cluster HA: enable host redundancy.




[1] Y.L Zhu, C.Y. Wang, W.Y. Xi, and F. Zhou, “SANSim: a Simulation and Design 
Platform of Storage Area Network”, NASA/IEEE Conference on Mass Storage 
Systems and Technologies (MSST2004), USA 2003, pp 101-112. 
[2] C.Y. Wang, F. Zhou, Y.L. Zhu, C.T. Chong, B. Hou, and W.Y. Xi, “Simulation 
and Analysis of FC Network”, the 28th Annual IEEE Conference on Local 
Computer Networks (LCN 2003), Germany 2003, pp 285-286. 
[3] C.Y. Wang, F. Zhou, Y.L. Zhu, C.T. Chong, B. Hou, and W.Y. Xi, “Simulation of 
Fibre Channel Storage Area Network Using SANSim”, IEEE 11th  International 
Conference on Network (ICON2003), Australia 2003, pp 349-354. 
[4] Y.L. Zhu, S.Y. Zhu and H. Xiong, “Performance Analysis and Testing of the 
Storage Area Network”, 19th IEEE Symposium on Mass Storage Systems and 
Technologies, USA 2002.  
[5] J. Gray, “Storage Bricks Have Arrived”, Keynote in Conference on File and
Storage Technologies (FAST’2002), USA 2002.
[6] A. Brown, D. Oppenheimer, K. Keeton, R. Thomas, J. Kubiatowicz, and D. 
Patterson, “ISTORE: Introspective Storage for Data-Intensive Network Services”, 
Proceedings of the 7th Workshop on Hot Topics in Operating Systems (HotOS-
VII), March 1999. 
[7] Y. Chen, L. Ni, C.Z. Xu, J. Kusler, and P. Zheng, “CoStore: A reliable and highly 
available storage systems”, Proc. of the 16th Annual Int'l Symposium on High 
Performance Computing Systems and Applications, Canada  2002, p 3. 
[8] J. Kubiatowicz, D. Bindel, P. Eaton, Y. Chen, D. Geels, R. Gummadi, S. Rhea, W.
Weimer, C. Wells, H. Weatherspoon, and B. Zhao, “OceanStore: An architecture
for global-scale persistent storage”, ACM SIGPLAN Notices, 35(11), November
2000, pp 190-201.
[9] A. Veitch, E. Riedel, S. Towers, and J. Wilkes, “Towards Global Storage 
Management and Data Placement”, Technical Memo HPL-SSP-2001-1, HP Labs, 
March 2001. 
[10] G. Gibson, and R. Meter, “Network Attached Storage Architecture”, 
Communications of the ACM, Vol. 43, No 11, November 2000, pp37-45. 
[11] R. Hernandez, C. Kion, and G. Cole, “IP Storage Networking: IBM NAS and 
iSCSI Solutions”, Redbooks Publications (IBM), SG24-6240-00, June 2001. 
[12] E. Miller, D. Long, W. Freeman, and B. Reed, “Strong Security for
Network-Attached Storage”, Proc. of the Conference on File and Storage Technologies
(FAST’2002), Monterey, CA, January 2002, pp 1-14.
[13] D. Nagle, G. Ganger, J. Butler, G. Goodson, and C. Sabol, “Network Support 
for Network-Attached Storage”, Hot Interconnects’1999, August 1999. 
[14] R. Khattar, M. Murphy, G. Tarella and K. Nystrom, “Introduction to Storage 
Area Network”, Redbooks Publications (IBM), SG24-5470-00, September 1999. 
[15] Nishan System white paper, “Storage over IP (SoIP) Framework – The Next 
Generation SAN”, June 2001. 
[16] B. Phillips, “Have Storage Area Networks Come of Age?” IEEE Computer, 
Vol. 31, No. 7, 1998, pp 10-12. 
[17] L. Cherkasova, V. Kotov, T. Rokicki, “Fibre Channel Fabrics: Evaluation and 
Design”, Proceedings of the 29th Annual Hawaii International Conference on 
System Sciences, 1996, pp 53-62. 
[18] X. Molero, F. Silla, V. Santonja and J. Duato, “On the Switch Architecture for 
Fibre Channel Storage Area Network”, Proceeding of 8th International Conference 
on Parallel and Distributed Systems (ICPADS-2001), June 2001, pp 484-491. 
[19] X. Molero, F. Silla, V. Santonja and J. Duato, “A Tool for The Design And 
Evaluation Of Fibre Channel Storage Area Networks”, Proceedings of 34th 
Simulation Symposium, 2001, p 113. 
[20] P. Berenbrink, A. Brinkmann and C. Scheideler, “SIMLAB - A Simulation 
Environment for Storage Area Networks”, 9th Euromicro Workshop on Parallel 
and Distributed Processing (PDP), 2000, pp 119-128. 
[21] M. Jurczyk, “Performance and Implementation Aspects of Higher Order Head-
of-Line Blocking Switch Boxes”, Proceedings of the 1997 International Conference 
on Parallel Processing, IEEE 1997. p 49. 
[22] J. Bucy and G. Ganger, “The DiskSim Simulation Environment Version 3.0 
Reference Manual”, Carnegie Mellon University Technical Report CMU-CS-03-
102, January 2003.  
[23] M. Busari and C. Williamson. “On the sensitivity of web proxy cache 
performance to workload characteristics”, IEEE Infocom 2001, pp 1225-1234. 
[24] FC-AL, “FC Arbitrated Loop”, ANSI X3.272:1996. 
[25] FC-PH, “Fibre Channel Physical and Signaling Interface (FC-PH)”, ANSI 
X3.230:1994.  
[26] FC-SW, “FC Switch Fabric and Switch Control Requirements”, ANSI 
X3.950:1998.  
[27] Technical Committee T11, FC Projects http://www.t11.org/Index.html. 
[28] G. R. Ganger, “Generating representative synthetic workloads: An unsolved 
problem”, In Proceedings of the Computer Measurement Group Conference, 
December 1995, pp 1263-1269. 
[29] J. Wilkes, “The Pantheon storage-system simulator”, Technical Report HPL--
SSP--95--14. Storage Systems Program, Hewlett-Packard Laboratories, Palo Alto, 
CA, December 1995. 
[30] C. Zhu, “SimSANs: Simulating Storage Area Networks”, SAN Simulation tool, 
http://simsan.storwav.com/. 
[31] SANmetrix, http://www.storagecorp.com/. 
[32] SANTK, http://www.borg.umn.edu/fc/SANTK/. 
[33] M. E. Gomez and V. Santonja, “A new approach in the analysis and modeling 
of disk access patterns”, In Performance Analysis of Systems and Software 
(ISPASS 2000), IEEE, April 2000, pp 172-177. 
[34] M. E. Gomez and V. Santonja, “A new approach in the modeling and 
generation of synthetic disk workload”, In Proceedings of the 8th International 
Symposium on Modeling, Analysis and Simulation of Computer and 
Telecommunication Systems, IEEE, 2000, pp 199-206. 
[35] S. D. Gribble, G. S. Manku, D. Roselli, E. A. Brewer, T. J. Gibson, and E. L. 
Miller, “Self-similarity in file systems”, In Proceedings of SIGMETRICS, 1998, pp 
141-150. 
[36] B. Hong and T. Madhyastha, “The relevance of long-range dependence in disk 
traffic and implications for trace synthesis”, Technical report, University of 
California at Santa Cruz, 2002. 
[37] B. Hong, T. Madhyastha, and B. Zhang, “Cluster-based input/output trace 
synthesis”, Technical report, University of California at Santa Cruz, 2002. 
[38] Qlogic Corp., http://www.qlogic.com. 
[39] R. Love, “Linux kernel development”. 
[40] D. P. Bovet, M. Cesati, “Understanding the Linux Kernel, 2nd edition”. 
[41] A. Rubini, J. Corbet, “Linux Device Drivers, 2nd Edition”. 
[42] R. V.Boppana and S. Chalasani, “Fault-tolerant wormhole routing algorithms 
for mesh networks”, IEEE Transactions on Computers, vol.44, no. 7, July 1995, pp 
848-864. 
[43] R. Casado, A Bermudez, F.J. Quiles, J.L. Sanchez and J. Duato, “Performance 
evaluation of dynamic reconfiguration in hi-speed local area networks”, 
Proceedings of the 6th International Symposium on High-Performance Computer 
Architecture, January 2000. 
[44] A. A. Chien and J. H. Kim, “Planar-adaptive routing: Low-Cost adaptive 
networks for multiprocessors”, Journal of the ACM, vol. 42, no. 7, 1995, pp 91-
123. 
[45] G. Ciardo, L. Cherkasova, V. Kotov and T.Rokicki, “Modeling a fibre channel 
switch with stochastic Petri nets”, Proceedings of ACM SIGMETRICS and 
PERFORMANCE’95, 1995, pp 319-320. 
[46] The Network Simulator – NS2, http://www.isi.edu/nsnam/ns/. 
[47] Y. Song, D. Cypher, and D. Su, “Simulation and Performance of PNNI ATM 
Networks”, Proceedings of the 7th International Conference on Telecommunication 
Systems Modeling and Analysis, March 1999, pp 387-401. 
[48] H. Schwetman, “CSIM: A C-Based Process-Oriented Simulation Language”,
Proc. of the 1986 Winter Simulation Conference, J.R. Wilson, J.O. Henriksen, and
S.D. Roberts, Eds. IEEE, 1986, pp 387–396.
[49] G. Edwards and R. Sankar, “Modeling and Simulation of Networks Using
CSIM”, Simulation, vol. 58, no. 2, 1992, pp 131–136.
[50] J.A. Hamilton, Jr., G.R. Ratterree, P.C. Brutch, and U.W. Pooch, “Public 
Domain Tools for Modeling and Simulating Computer Networks”, Simulation, vol. 
67, no. 3, 1996, pp 161-169. 
[51] Brocade Communications Systems, Inc., http://www.brocade.com/. 
[52] IOMeter, an I/O subsystem measurement and characterization tool, 
http://www.iometer.org/. 
 
 
 
 
 
 
 
