Abstract: A new power management scheme enables us to apply the power-gating technique to a 200-Gbps packet forwarding circuit. The circuit is equipped with two 100-Gbps forwarding engines (FEs), each of which processes inbound and outbound packets. When the amount of traffic decreases and only one forwarding engine is capable of processing all incoming packets, a packet-route switch installed in the data path routes all packets to one of the two FEs and the other FE is shut off. This technique reduces the power consumption by 14% when the total traffic rate is less than half of the maximum rate of 200 Gbps.
Internet traffic is increasing rapidly as broadband services such as LTE smartphones spread, and there is a growing demand for higher transmission speeds in networks. 40-Gbps/100-Gbps Ethernet has been standardized, and the transmission data rate will exceed 100 Gbps per optical wavelength in the trunk network [1, 2]. Routers, which forward packets transmitted in core and access networks, will require higher transmission speeds, more switching capacity, and, simultaneously, reduced power consumption to realize energy-efficient networks for the "green of information and communications" [3]. The packet-forwarding LSIs in a router buffer the packets from network interfaces (NIF), modify the packet headers, and output them to the cross-bar switch (CSW) in order to forward them to their destination. An overall throughput of 400 Gbps per chassis is considered a target for the next generation of high-end routers. Therefore, a forwarding ability of 100 or 200 Gbps per chip is desirable for the LSIs.
To reduce power consumption, sleep techniques in circuit operation, such as clock and power gating, have been widely implemented in various kinds of LSIs. For example, high-performance multi-core processors using the power-gating technique with task scheduling have been reported [4]. However, reports on a power-gating scheme for high-throughput packet-forwarding LSIs are rarely found, mainly because of the difficulty of ensuring that unpredictably arriving packets are forwarded while circuits are turned off. Recently, power reduction of a router circuit has been achieved by using a commercially available automatic clock-gating design tool [5], but this was only a simulation study and the function of the router was limited.
In this paper, we propose a power management technique for a 200-Gbps packet-forwarding LSI, which employs two 100-Gbps forwarding engines (FEs) and applies power gating to one of the two FEs while continuing packet-forwarding operation using the other one. With this architecture, we will show that we can devise a reliable circuit-operation procedure that ensures no packets are lost during the power-gating cycles.
We set the requirements for the devised technique as follows:
1. The maximum forwarding throughput should reach up to 200 Gbps.
2. When the total amount of traffic is reduced to less than half of the maximum forwarding throughput, power consumption should be reduced by power gating one of the two FEs.
3. Power shut-off (PSO) and recovery from PSO should be possible while packets are being forwarded. During this time period, packet-loss and packet-sequence problems should never be permitted.
Requirement 3 is very important for making the packet-forwarding operation reliable.
Power reduction scheme
To fulfill these requirements, we focused on the fact that the process for forwarding inbound packets (packets from an NIF; hereafter i-packets) and outbound packets (packets from a CSW; o-packets) is almost the same. On the basis of this similarity, we can design the forwarding circuit so that i-packets and o-packets share the same FE. We compose the circuit with a packet-route switch and two 100-Gbps power-gated FEs, one of which can be shut off depending on the traffic conditions. Fig. 1(a) shows a block diagram of the forwarding circuit. The packet-route switch is able to put the i- and o-packets into either of the FEs. In Fig. 1(b), mode A illustrates the operation of 200-Gbps packet forwarding. Here, i- and o-packets are each processed in a dedicated FE. When the total data rate of i- and o-packets decreases to less than the forwarding capacity of FE2, the switch changes the packet route and puts both i- and o-packets into FE2, and FE1 [including the internal packet buffer (PB1)] is shut off (mode B).
Block diagram
Another power-saving mode (mode C) is to use FE1 for processing both the i- and o-packets, provided that the total data rate is less than the forwarding capacity of FE1. In this case, FE2 and the memory management unit (MMU) are shut off. The external packet buffer (PB2) is also shut off in mode C.
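The choice among modes A, B, and C can be illustrated with a minimal sketch. It is not the controller implemented in the LSI: the function names, the `prefer_fe2` flag, and the use of a single threshold equal to the 100-Gbps FE capacity are assumptions made only for illustration.

```python
from enum import Enum

# Per-FE forwarding capacity stated in the text.
FE_CAPACITY_GBPS = 100.0


class Mode(Enum):
    A = "both FEs powered"
    B = "FE1 (including PB1) shut off"
    C = "FE2, MMU, and PB2 shut off"


def select_mode(i_rate_gbps, o_rate_gbps, prefer_fe2=True):
    """Pick an operation mode from the i- and o-packet data rates.

    prefer_fe2 is a hypothetical knob: the text allows either FE to be
    the surviving one (mode B keeps FE2, mode C keeps FE1).
    """
    total = i_rate_gbps + o_rate_gbps
    if total < FE_CAPACITY_GBPS:
        return Mode.B if prefer_fe2 else Mode.C
    return Mode.A


def powered_blocks(mode):
    """Blocks that stay powered in each mode, following the text."""
    all_blocks = {"FE1", "PB1", "FE2", "MMU", "PB2"}
    if mode is Mode.B:
        return all_blocks - {"FE1", "PB1"}
    if mode is Mode.C:
        return all_blocks - {"FE2", "MMU", "PB2"}
    return all_blocks
```

For example, `select_mode(30, 40)` returns `Mode.B`, and `powered_blocks(Mode.B)` leaves FE2, the MMU, and PB2 powered.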
Packet-route switch
The packet-route switch consists of per-lane FIFO register files, a packet selector (SEL) for each FE, and a read-out/packet route controller (Cont.). Pausing read-out from the register file and restarting it after the packet route has been changed avoids packet loss. In the same way, packet loss is avoided while packets are read out from the FEs. Controlling the read-out order from each FE eliminates the packet-sequence problem. Fig. 2(a) shows the procedures for changing the packet route, shutting off power, turning on power, and recovering the packet route. Fig. 2(b) and Fig. 3 show a schematic and the timing chart for the packet-route switch, respectively.
I-packets and o-packets are normally input to FE1 and FE2, respectively [State (1) in Fig. 2(a) and Fig. 3]. When the total data rate of i- and o-packets is less than the forwarding capacity of FE2, we first disable the output from FE2 for i-packets until the processing of the preceding i-packets in FE1 completes [State (2); OEN_2(for i-packets) = "disable"], and the route of i-packets is changed to FE2 [State (3)]. To change the route of packets read out from the o-FIFO to the i-FIFO, for example, the read-out/packet route controller turns Sel_2 for i-packets to "enable" instead of Sel_1. Restarting read-out after changing the packet route avoids packet loss [State (3a)-(3c) in Fig. 3].
After the last i-packet is forwarded by FE1 and PB1 becomes empty [State (4)], the output path for i-packets can be set to "from PB2" [State (5)]. At this moment, we can enable the PB2 output for i-packets [State (6); OEN_2(for i-packets) = "enable"], and then the first i-packet forwarded by FE2 and PB2 is transmitted. This PB1 monitoring and PB2 read-out control eliminates the packet-sequence problem. FE1 (including PB1), which has no packet to forward, is shut off [State (7)], which reduces the power consumption. When the total data rate of i- and o-packets becomes larger than the threshold value for FE2, FE1 (including PB1) is turned on [State (8)] and, through the same procedure as States (2)-(6), the route of i-packets is returned to the original one [State (9)-(13)], while avoiding packet-loss and packet-sequence problems.
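The effect of holding back the PB2 output until PB1 is empty can be seen in a toy simulation. This is a sketch under stated assumptions, not a model of the actual circuit: the function name, the one-arrival-per-tick and read-out-every-second-tick rates, and the `guard` flag are hypothetical and chosen only so that packets accumulate in the buffers.

```python
from collections import deque


def forward_i_packets(num_packets, switch_at, guard=True):
    """Return the output order of i-packets 0 .. num_packets-1.

    Packets with index < switch_at take the FE1/PB1 route; later ones
    take the FE2/PB2 route, mimicking the route change in State (3).
    With guard=True, the PB2 output for i-packets stays disabled until
    PB1 has drained, as in States (2) and (4)-(6).
    """
    pb1, pb2 = deque(), deque()
    out = []
    oen_2 = not guard        # State (2): OEN_2 (for i-packets) = "disable"
    next_pkt, tick = 0, 0
    while next_pkt < num_packets or pb1 or pb2:
        if next_pkt < num_packets:            # one arrival per tick (assumed)
            (pb1 if next_pkt < switch_at else pb2).append(next_pkt)
            next_pkt += 1
        if next_pkt > switch_at and not pb1:  # States (4)-(6): PB1 is empty
            oen_2 = True
        if tick % 2 == 1:                     # read-out slot (assumed rate)
            if pb1:
                out.append(pb1.popleft())
            if pb2 and oen_2:
                out.append(pb2.popleft())
        tick += 1
    return out
```

With `guard=True` the output order matches the input order. With `guard=False` both buffers are read in the same slot, so packets from PB2 overtake packets still waiting in PB1; no packet is lost in either case, but the sequence is broken in the unguarded one.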
We implemented the packet-route switch compactly using cascaded 2:1 selectors, which select i- and o-packets from 42 lanes of various serial data inputs and control the data flow to either FE1 or FE2 at the total data rate of 200 Gbps. The switch must select the route for packet buses with 42 × 512-bit width (i.e., the total number of signal bus lines is over 20,000). As shown in Fig. 2(b), both packet-route selectors to PB1 and PB2 are compactly composed of 41-step cascaded 512-bit 2:1 selectors. Compactness is achieved because the 2:1 selectors are distributed within the FIFO register circuits and because the 512-bit selected signal lines only have to reach the next step. This compactly composed and simply controlled switch makes it possible to change the packet route immediately while forwarding packets successfully.
Fig. 4 shows the output eye diagram of one of the twelve parallel 10.3125-Gbps Interlaken data outputs used for 100-Gbps data transmission (the total bandwidth of this interface is designed to have 120-Gbps transmission capability). The bit error rate (BER) of the transmission was measured at the end point of forwarded packets with a packet analyzer. The end-to-end BER, including the packet-forwarding operation, is lower than 7.31 × 10^-14.
Fig. 3. Timing chart for controlling the read-out timing and packet route.
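The cascaded selector structure for the 42 lanes can be illustrated behaviorally as follows. This is only a sketch: the function name and the rule that each stage chooses between its own lane and the word passed from the previous stage are assumptions about how such a chain might operate, not a description of the fabricated circuit.

```python
LANES = 42            # number of lanes stated in the text
BUS_WIDTH_BITS = 512  # per-lane packet bus width stated in the text


def cascaded_select(lane_words, selected_lane):
    """Select one lane's word with a chain of 2:1 selectors.

    Each stage either takes the word of its own lane or passes along
    the word coming from the previous stage, so N lanes need N - 1
    stages and each stage only drives the next one.
    Returns (selected word, number of 2:1 stages traversed).
    """
    carry = lane_words[0]
    stages = 0
    for lane in range(1, len(lane_words)):
        # One 2:1 selector: local lane word versus upstream word.
        carry = lane_words[lane] if lane == selected_lane else carry
        stages += 1
    return carry, stages


# 42 lanes x 512 bits gives the "over 20,000" signal lines in the text.
TOTAL_BUS_LINES = LANES * BUS_WIDTH_BITS
```

With 42 input words, the chain always traverses 41 stages, which matches the 41-step cascade described for each selector, and `TOTAL_BUS_LINES` evaluates to 21,504.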
Experimental results
IEICE Electronics Express, Vol.10, No.11, 1-6
Fig. 5 shows the relationship between the forwarding throughput and power consumption. The maximum total data rate of i- and o-packets is 200 Gbps. From 200 to 100 Gbps, both FE1 and FE2 are employed for forwarding packets. Static leakage power is well suppressed, and dynamic power decreases as the packet-forwarding load falls, as expected from theory. When the total data rate is smaller than the forwarding capacity of FE2 (100 Gbps), the power for FE1 is shut off while both the i- and o-packets are forwarded by FE2, and the power consumption is further reduced by 14%. This makes it possible for this LSI to forward packets at a high data rate of 100 Gbps with power as low as that in the stand-by state (no packet forwarding) when PSO is not applied. Moreover, the lower the total data rate is, the lower the power consumption becomes. The measured power consumption includes that of the logic core, internal memory, PLL/DLL, and I/O cores (excluding that of I/O drivers). Power was measured at room temperature with fan cooling.
Conclusion
We have demonstrated the first power-gating technique for a 200-Gbps packet-forwarding LSI. The packet route can be changed and one of the two 100-Gbps forwarding engines is shut off when the total amount of traffic becomes less than half of the maximum forwarding throughput. Both a 14% power reduction and 200-Gbps throughput are achieved. The technique can greatly improve the throughput and power efficiency of high-end routers.
