Abstract-This paper presents models and analytical techniques for studying the behavior of an interrupt-driven kernel under the high packet arrival rates found in Gigabit networks. An analytical study is presented that describes the impact of a high interrupt rate on system performance, measured in terms of throughput, latency, and system power. Equations are derived for system throughput, latency, power, and the stability condition. Results from both reported experimental findings and simulations show that our analytical model is valid and gives a good approximation. To the best of the authors' knowledge, the impact of interrupts on system performance has not previously been studied analytically, and this work is the first of its kind.
INTRODUCTION
The interrupt overhead of Gigabit network devices can have a significant negative impact on system performance. Traditional operating systems were designed to handle network devices that interrupt at a rate of around 1000 packets per second, as is the case for 10 Mbps Ethernet. The cost of handling interrupts in these traditional systems was low enough that any normal system would spend only a small fraction of its CPU time handling interrupts.
For 100 Mbps Ethernet, the interrupt rate increases to about 8,000 interrupts per second using the standard maximum 1500-byte packets. For Gigabit Ethernet, the interrupt rate for the same maximum-sized packet of 1500 bytes rises to roughly 80,000 interrupts per second. With 10 Gigabit Ethernet, and with smaller packets, the problem is far worse.
In Gigabit networks, the packet arrival rate can surpass the system's packet processing rate, which includes both network protocol stack processing and interrupt handling. With Gigabit Ethernet, minimum-sized packets of 512 bytes arrive at roughly 244,000 packets per second, so the CPU must handle each interrupt in less than about 4 µs in order to keep up with such a rate. According to [1], a null system call (not an interrupt) on a typical 666 MHz Intel Pentium III takes on the order of 10 µs, and a typical latency for handling an interrupt due to a packet arrival in Linux is on the order of 50 µs. Interrupt-driven systems tend to perform very badly under such heavy load conditions. Interrupt-level handling, by definition, has absolute priority over all other tasks. If the interrupt rate is high enough, the system spends all of its time responding to interrupts, nothing else gets done, and system throughput drops to zero. This situation is called receive livelock [2]. In this situation the system is not deadlocked, but it makes no progress on any of its tasks, starving every task scheduled at a lower priority. At low packet arrival rates, the cost of interrupt overhead and the latency for handling incoming packets are low. However, interrupt overhead increases directly with the packet arrival rate, eventually causing receive livelock.
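As a concrete illustration, the short Python sketch below (ours, not from the paper) reproduces these back-of-the-envelope figures; it simply divides the line rate by the frame size and ignores preamble and inter-frame gap overhead.

def pkts_per_sec(line_rate_bps, pkt_bytes):
    return line_rate_bps / (pkt_bytes * 8)

for rate_mbps in (10, 100, 1000):
    rate = pkts_per_sec(rate_mbps * 1e6, 1500)
    print(f"{rate_mbps:>4} Mbps: {rate:>9,.0f} interrupts/s")
#   10 Mbps:       833 interrupts/s   (~1,000 cited above)
#  100 Mbps:     8,333 interrupts/s   (~8,000)
# 1000 Mbps:    83,333 interrupts/s   (~80,000)

budget_us = 1e6 / pkts_per_sec(1e9, 512)    # minimum-size frames at 1 Gbps
print(f"{budget_us:.1f} us per interrupt")  # ~4.1 us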
The receive livelock condition has been shown by experiments and measurements on real systems [3, 4]. In this paper we present a model of the receive livelock phenomenon and show its analytical solution. This model can be used to understand and predict the performance and behavior of interrupt-driven systems, and can serve as a reference model for comparing the performance of the solutions proposed to resolve the receive livelock condition. More importantly, the paper presents an analytical study of system throughput, latency, and system power under the high interrupt rates found in Gigabit networks.
A number of solutions have been proposed to minimize interrupt overhead and resolve the receive livelock condition, including interrupt coalescing, OS-bypass protocols, zero-copy, jumbo frames, polling, and pushing some or all protocol processing to hardware. Some of these solutions are surveyed in [2, 3, 4, 5, 6]. To the best of our knowledge, however, none of this work analytically models and studies the performance and behavior of a system under heavy network load.
The rest of the paper is organized as follows. Section 2 presents the analysis of two models: an ideal system that ignores the impact of interrupts on system performance, and a second model that captures the system behavior under low and high network traffic intensity. Numerical examples are given in Section 3. A note on the accuracy of the analysis is given in Section 4. Finally, Section 5 concludes the paper and identifies future work.
ANALYSIS
In this section we present an analytical study to examine the impact of interrupts on system performance. First, we define the system parameters. Let λ be the mean incoming packet arrival rate and µ be the mean protocol processing rate of the kernel. Thus 1/µ is the time it takes the system to process an incoming packet and deliver it to the application program; this time consists primarily of the network protocol stack processing by the kernel and excludes any interrupt handling. The interrupt handling time is denoted by T_ISR; it is the interrupt service routine time for handling an incoming packet. We also define the traffic intensity, or system load, as ρ = λ/µ.
We study the system performance in terms of three commonly used performance metrics: throughput, latency, and system power. System throughput (γ) is the rate at which packets are delivered by the kernel to the application program. Latency, or mean response time (R), is the time between a packet's arrival at the NIC and its delivery to the application program. Since an improvement in system throughput generally has a negative impact on latency, and vice versa, system power (P) was proposed in [8] to resolve this tradeoff. Maximizing system power identifies the correct operating point, the one that maximizes throughput while minimizing latency.
Ideal System
This section presents analysis for the ideal situation in which the overhead involved in generating interrupts is totally ignored. In this case the system can be modeled as an M/M/1/B queue with a Poisson packet arrival rate λ, an exponential service rate µ, and a buffer of size B.

Note. It is worth mentioning that our analysis assumes Poisson arrivals for network traffic. It has to be stated that network traffic is not always Poisson in nature; however, this assumption makes the analysis tractable. As we demonstrate in Sections 4 and 5, our model with these assumptions, including that of Poisson arrivals, turns out to be a good approximation to an experimental system with real network traffic.
In the M/M/1/B model, the system throughput can be expressed as

γ = µ (1 − p_0),    (2.1)

where p_0 is the probability that the system is idle, given by

p_0 = (1 − ρ) / (1 − ρ^(B+1)).
The system packet-processing latency R can be given, via Little's law, by

R = N̄ / (λ (1 − p_B)),    (2.2)

where N̄ = ρ/(1 − ρ) − (B + 1) ρ^(B+1) / (1 − ρ^(B+1)) is the mean number of packets in the system, and p_B = p_0 ρ^B is the probability of a packet being dropped due to the buffer being full.
System power is expressed by [8] as

P = γ^α / R,    (2.3)

where α is a positive, real, tunable parameter. Normally α = 1, in which case increasing throughput and decreasing latency are given equal weight. For our study we set α = 1.
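As a sketch, the following Python function (names and defaults are ours) evaluates equations (2.1)-(2.3) for the ideal M/M/1/B system; it assumes ρ ≠ 1.

def mm1b_metrics(lam, mu, B, alpha=1.0):
    """Ideal-system throughput, latency, and power; assumes rho != 1."""
    rho = lam / mu
    p0 = (1 - rho) / (1 - rho ** (B + 1))        # idle probability
    pB = p0 * rho ** B                           # blocking probability
    gamma = mu * (1 - p0)                        # throughput, eq. (2.1)
    n_bar = rho / (1 - rho) - (B + 1) * rho ** (B + 1) / (1 - rho ** (B + 1))
    R = n_bar / (lam * (1 - pB))                 # latency, eq. (2.2)
    return gamma, R, gamma ** alpha / R          # power, eq. (2.3)

print(mm1b_metrics(lam=0.5, mu=1.0, B=1000))     # mu = 1, B = 1000 as in Section 3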
Impact of Gigabit-Network Interrupts
Modeling an interrupt-driven system is a challenging task, especially in the Gigabit networking environment, where ρ can exceed 1. For every incoming packet, an interrupt is initiated. The system processes the packet by first executing the ISR and then handing the packet to the protocol stack, where it gets processed.
Hence, the total system processing time per packet is simply T_ISR + 1/µ. However, this value does not hold all the time; it depends on the arrival time of the next packet. If the next packet arrives while the interrupt of a previous packet is being handled, i.e., before the system has finished executing the current ISR, the total processing time for the two packets is T_ISR + 2/µ. This is because the new interrupt is masked off while another interrupt of the same interrupt priority level is being serviced, so no new T_ISR is incurred; the kernel time to process the two packets in the protocol stack, however, is 2/µ. For example, with µ = 1 and T_ISR = 0.2, two packets whose interrupts coincide cost 0.2 + 2 = 2.2 time units rather than 2(0.2 + 1) = 2.4.
As a matter of good design practice, we would like to minimize the execution time of the ISR as much as possible. Therefore, we assume the primary job of the ISR is to notify the kernel of the arrival of a new packet. The notification happens only after the packet has been copied by DMA into system host memory. This assumption is valid since, in a Gigabit networking environment, the use of DMA becomes necessary in order to eliminate any CPU overhead involved in copying packets from the NIC to kernel memory. Major network vendors equip Gigabit NICs with DMA engines; these suppliers include 3Com, HP, Alteon (now owned by Nortel), Sundance, and NetGear.
After notification of the arrival of a new packet, the kernel processes the packet by first examining the type of frame received and then immediately invoking the proper protocol handler or stack function, e.g., ARP, IP, or TCP. The packet remains in kernel or system host memory until it is discarded or delivered to the user application.
We also assume that the protocol processing for packets by the kernel will continue as long as there are packets available in the system memory buffer. However, this protocol processing of packets can be interrupted by ISR executions as a result of new packet arrivals. This is so because packet processing by the kernel runs at a lower priority than the ISR.
One may think that such an interrupt-driven system can simply be modeled as a priority queueing system with preemption, in which there are two arrival streams of different priorities. The first stream consists of ISRs and has the higher priority; the second is the stream of incoming packets and has the lower priority. As noted, ISR execution preempts protocol processing.
However, this is an invalid model, because ISR handling is not incurred for every packet arrival. ISR handling is skipped if the system is already servicing another interrupt of the same level. In other words, if the system is currently executing another ISR, the new interrupt, being of the same priority level, is masked off and receives no service of its own.
Mean Effective Service Time
In this section, we derive the mean effective service time for processing packets in the kernel protocol stack. With this formula in hand, the system can be modeled as an M/G/1 queue with Poisson packet arrivals of rate λ and a generally distributed effective service time with mean rate µ′.
As illustrated in Figure 1, the effective service time is the actual time available for servicing a packet, exclusive of any T_ISR disruption; it is the time available between successive ISRs. If one or more packets arrive during a T_ISR, the interrupts are batched (masked off) and the packets are queued into the system with effectively a single T_ISR disrupting the service time. The disruption of the service time is therefore governed mainly by the packet arrival rate λ and by T_ISR.
Let us assume that T_ISR is exponentially distributed with mean T̄_ISR = 1/r. One can express the mean effective service rate µ′ as the rate at which packets are processed by the kernel's network protocol stack with no interrupt disruption. Therefore,

µ′ = µ · (% CPU availability for protocol processing).    (2.4)
In order to determine the percentage of CPU time available for protocol processing versus interrupt handling, we use a Markov process to model the CPU usage, as illustrated in Figure 2. The process has a state (0,0) and states (1,n). State (0,0) represents the state in which the CPU is available for protocol processing. States (1,n), with 0 ≤ n < ∞, represent the states in which the CPU is busy handling interrupts, where n denotes the number of packet arrivals batched or masked off during T_ISR. Note that state (1,0) means no interrupts are being masked off and the CPU is handling a single interrupt. The steady-state equations can be derived from 0 = pQ, where p = (p_(0,0), p_(1,0), p_(1,1), p_(1,2), …) and Q is the transition-rate matrix

Q =
  [ −λ       λ        0        0     ⋯ ]
  [  r    −(λ+r)      λ        0     ⋯ ]
  [  r       0     −(λ+r)      λ     ⋯ ]
  [  ⋮       ⋮        ⋱        ⋱       ]

Solving 0 = pQ together with the normalization condition Σp = 1 yields p_(0,0) = r/(λ + r); that is, the CPU is available for protocol processing a fraction r/(λ + r) of the time. Substituting into (2.4) gives

µ′ = µ r / (λ + r).

The system can now be analyzed as before, with the mean service rate µ replaced by the mean effective service rate µ′. Hence, the system throughput γ, latency R, and power P are expressed by equations (2.1), (2.2), and (2.3), respectively.
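As a numerical check (our sketch, using a truncated chain), the script below builds Q up to a truncation level N of masked arrivals, solves 0 = pQ with Σp = 1, and compares p_(0,0) against the closed form. Note that because every busy state returns to (0,0) at rate r, the balance equation λ p_(0,0) = r (1 − p_(0,0)) yields p_(0,0) = r/(λ + r) regardless of the truncation level.

import numpy as np

def cpu_availability(lam, r, N=200):
    """Solve 0 = pQ for the truncated chain; state 0 is (0,0), k is (1,k-1)."""
    Q = np.zeros((N + 2, N + 2))
    Q[0, 0], Q[0, 1] = -lam, lam       # (0,0) -> (1,0): an interrupt arrives
    for k in range(1, N + 2):
        Q[k, 0] = r                    # ISR completes: back to (0,0)
        if k < N + 1:
            Q[k, k] = -(lam + r)
            Q[k, k + 1] = lam          # arrival masked off during the ISR
        else:
            Q[k, k] = -r               # truncation: further arrivals ignored
    A = np.vstack([Q.T, np.ones(N + 2)])   # append normalization sum(p) = 1
    b = np.zeros(N + 3); b[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p[0]

lam, r = 0.5, 5.0                      # e.g., mu = 1, T_ISR = 0.2
print(cpu_availability(lam, r))        # 0.909090..., matching the closed form
print(r / (lam + r))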
A particular point of interest is the stability condition for the system, i.e., the condition under which the effective system load λ/µ′ remains below 1. Its boundary defines the "cliff" point for system throughput: the point at which throughput starts falling toward zero as the system load increases. The stability condition for the system can be expressed as

λ < µ′ = µ r / (λ + r).    (2.5)
Solving for λ, this is equivalent to λ² + rλ − rµ < 0, whose roots are

λ = (−r ± √(r² + 4rµ)) / 2.

Since the term under the square root is always greater than r², the root with the negative sign is negative and is discarded. Therefore, the system is stable whenever

λ < (−r + √(r² + 4rµ)) / 2.    (2.6)
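A minimal sketch (ours) evaluating the bound in (2.6) with µ = 1 for the three T_ISR values used later in Section 3:

from math import sqrt

def cliff(mu, T_isr):
    r = 1.0 / T_isr
    return (-r + sqrt(r * r + 4 * r * mu)) / 2   # stability bound, eq. (2.6)

for T in (0.2, 0.3, 0.5):
    print(f"T_ISR = {T}: lambda < {cliff(1.0, T):.2f}")
# prints 0.85, 0.81, and 0.73, the cliff points reported in Section 3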
Another interesting point is the maximum system power point. This point is also the system's correct operating point, giving maximum throughput at minimum latency. To find it, we take the derivative of the power function with respect to λ and set it equal to zero. From [9], the maximum power point occurs when ρ < 1; hence it is sufficient in this case to model the system as M/M/1, since the unstable region ρ > 1 need not be considered. For this case, the throughput and latency as functions of λ are denoted by γ(λ) and R(λ), respectively.
Thus, the maximum power point occurs when

λ = (−r + √(r² + 2rµ)) / 2.    (2.7)
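A companion sketch (ours) evaluating (2.7) for the same parameters (µ = 1); the resulting values agree with the operating points quoted in Section 3 to within rounding.

from math import sqrt

def lambda_max_power(mu, T_isr):
    r = 1.0 / T_isr
    return (-r + sqrt(r * r + 2 * r * mu)) / 2   # max-power point, eq. (2.7)

for T in (0.2, 0.3, 0.5):
    print(f"T_ISR = {T}: lambda* = {lambda_max_power(1.0, T):.3f}")
# 0.458, 0.442, 0.414 -- cf. 0.46, 0.45, 0.41 in Section 3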
Special Case
We consider the special case in which interrupt handling is ignored, i.e., T_ISR = 0. In this situation, r → ∞. We prove that equations (2.5), (2.6), and (2.7) reduce to the corresponding results for the ideal system: as r → ∞, the effective service rate µ′ approaches µ, the stability condition becomes λ < µ (i.e., ρ < 1), and the maximum power point becomes λ = µ/2 (i.e., ρ = 0.5), exactly as in the ideal M/M/1 model.
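These limits can also be checked numerically; the sketch below (our illustration) evaluates (2.4), (2.6), and (2.7) for increasing r.

from math import sqrt

mu, lam = 1.0, 0.5
for r in (10.0, 1e3, 1e6):
    mu_eff = mu * r / (lam + r)                     # eq. (2.4)
    cliff = (-r + sqrt(r * r + 4 * r * mu)) / 2     # eq. (2.6)
    lam_star = (-r + sqrt(r * r + 2 * r * mu)) / 2  # eq. (2.7)
    print(f"r = {r:>9.0e}: mu' = {mu_eff:.4f}, cliff = {cliff:.4f}, "
          f"lambda* = {lam_star:.4f}")
# as r grows: mu' -> mu = 1, cliff -> mu = 1, lambda* -> mu/2 = 0.5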
NUMERICAL EXAMPLES
In this section, we report some numerical results of our analytical model to study the behavior of the system and the impact of interrupts on system performance. The system performance is studied as a function of traffic intensity ρ.
Numerical results are also given for the ideal system, in which interrupts are ignored. For all of these results, we fix µ = 1 and B = 1000.
We first examine the system throughput as a function of traffic intensity ρ. Figure 3 depicts the impact of high and low traffic intensity on system throughput. We note that for the ideal system the throughput behaves as expected. However, the throughput is different when the impact of interrupts is considered, i.e., the receive livelock phenomenon appears. We note that the throughput does not fall sharply to zero, due to the interrupt batching illustrated in Section 2.3. Figure 3 shows the system throughput for three values of T_ISR: 0.2, 0.3, and 0.5. As the interrupt overhead increases, i.e., as the value of T_ISR increases, the system throughput is degraded and the livelock phenomenon occurs earlier. Figure 3 also shows the cliff points for the system throughput. As previously defined, the cliff points are those points at which system throughput starts falling toward zero as the system load increases. As shown, the cliff points in terms of traffic intensity ρ for T_ISR of 0.2, 0.3, and 0.5 are 0.85, 0.81, and 0.73, respectively. Since µ is fixed at 1, the cliff points expressed in terms of traffic intensity and packet arrival rate coincide. These points match exactly those derived from equation (2.6) for the stability condition. Figure 4 illustrates the relation between packet latency and traffic intensity for the same system parameter values considered for system throughput. The ideal system has the lowest latency, and latency is worst when T_ISR = 0.5.
The impact of low and high traffic intensity on system power is shown in Figure 5. In the ideal system, the maximum system power occurs at ρ = 0.5. With interrupts, the maximum system power decreases as T_ISR increases, reaching its lowest value for T_ISR = 0.5. In addition, the figure shows that the maximum power point for T_ISR of 0.2, 0.3, and 0.5 occurs at λ of 0.46, 0.45, and 0.41, respectively. These points also match those derived from equation (2.7) for the λ that maximizes power.
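The curves in Figures 3-5 can be regenerated directly from the model by evaluating the M/M/1/B formulas with µ′(λ) from (2.4) substituted for µ. The sketch below (ours; the sample point λ = 0.7 lies in the stable region for all three T_ISR values) computes one point on each curve.

def metrics_with_interrupts(lam, mu, B, T_isr, alpha=1.0):
    r = 1.0 / T_isr
    mu_eff = mu * r / (lam + r)                  # effective rate, eq. (2.4)
    rho = lam / mu_eff                           # effective load
    p0 = (1 - rho) / (1 - rho ** (B + 1))
    pB = p0 * rho ** B
    gamma = mu_eff * (1 - p0)                    # eq. (2.1) with mu'
    n_bar = rho / (1 - rho) - (B + 1) * rho ** (B + 1) / (1 - rho ** (B + 1))
    R = n_bar / (lam * (1 - pB))                 # eq. (2.2) with mu'
    return gamma, R, gamma ** alpha / R          # eq. (2.3)

for T in (0.2, 0.3, 0.5):
    g, R, P = metrics_with_interrupts(lam=0.7, mu=1.0, B=1000, T_isr=T)
    print(f"T_ISR = {T}: throughput = {g:.3f}, latency = {R:.2f}, power = {P:.4f}")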
VERIFICATION AND VALIDATION OF ANALYSIS
In order to verify our analytical model, we built a discrete-event simulation in C and ran a large number of simulation runs. In all cases, excellent agreement was observed. The analysis was also verified by proving that all derived equations reduce to those of the ideal system model in the special case where interrupt handling is ignored. In addition, our analytical results were compared to the experimental findings reported in [3, 4], in particular for system throughput. Our analytical results are very much in line with these reported experimental results.
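The sketch below is our minimal Python re-creation of this kind of discrete-event simulation (the authors' implementation was in C and is not reproduced here). It simulates Poisson arrivals, a single maskable interrupt level with exponential ISR times, and preemptive-priority protocol processing; a preempted service time is simply redrawn when processing resumes, which is statistically exact because the exponential distribution is memoryless.

import random
from math import inf

def simulate(lam, mu, T_isr, B=1000, n_events=1_000_000, seed=1):
    rng = random.Random(seed)
    exp = rng.expovariate
    r = 1.0 / T_isr
    t = 0.0
    queue = 0                  # packets buffered for protocol processing
    delivered = 0
    t_arr = exp(lam)           # next packet arrival
    t_isr = inf                # current ISR completion (inf: no ISR running)
    t_srv = inf                # current packet service completion
    for _ in range(n_events):
        t = min(t_arr, t_isr, t_srv)
        if t == t_arr:                    # packet arrival
            if queue < B:
                queue += 1                # else: packet dropped, buffer full
            if t_isr == inf:              # interrupt not masked:
                t_isr = t + exp(r)        # run the ISR, preempting the kernel
                t_srv = inf               # (service redrawn when the ISR ends)
            t_arr = t + exp(lam)
        elif t == t_isr:                  # ISR completes
            t_isr = inf
            if queue > 0:
                t_srv = t + exp(mu)       # protocol processing resumes
        else:                             # a packet is delivered
            delivered += 1
            queue -= 1
            t_srv = t + exp(mu) if queue > 0 else inf
    return delivered / t                  # measured throughput

for T in (0.2, 0.3, 0.5):
    print(T, round(simulate(lam=0.7, mu=1.0, T_isr=T), 3))

For arrival rates below the cliff of (2.6) the measured throughput approaches λ, and pushing λ past the cliff reproduces the gradual throughput decline of Figure 3.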
CONCLUSION
We presented a valid analytical model that captures the behavior of interrupt-driven systems subjected to high interrupt rates. We proposed and studied two models: an ideal system that ignores the impact of interrupts on system performance, and a second model that captures the system behavior under low and high traffic intensity. Simulation and reported experimental results show that our analytical model is valid and gives a good approximation. As further work, we will evaluate the performance of the various solutions proposed for decreasing interrupt overhead and resolving the receive livelock problem.
