This paper examines the latency in Internet path failure, failover and repair due to the convergence properties of inter-domain routing. Unlike switches in the public telephony network, which exhibit failover on the order of milliseconds, our experimental measurements show that inter-domain routers in the packet-switched Internet may take tens of minutes to reach a consistent view of the network topology after a fault. These delays stem from temporary routing table oscillations formed during the operation of the BGP path selection process on Internet backbone routers. During these periods of delayed convergence, we show that end-to-end Internet paths will experience intermittent loss of connectivity, as well as increased packet loss and latency. We present a two-year study of Internet routing convergence through the experimental instrumentation of key portions of the Internet infrastructure, including both passive data collection and fault-injection machines at major Internet exchange points. Based on data from the injection and measurement of several hundred thousand inter-domain routing faults, we describe several unexpected properties of convergence and show that the measured upper bound on Internet inter-domain routing convergence delay is an order of magnitude slower than previously thought. Our analysis also shows that the theoretical upper bound on the number of router states and control messages exchanged during the process of BGP convergence is factorial with respect to the number of autonomous systems in the Internet. Finally, we demonstrate that much of the observed convergence delay stems from specific router vendor implementation decisions and ambiguity in the BGP specification.