In this paper, a 40 Gbps cascaded bit-interleaving passive optical network (CBI-PON) is proposed to reduce power consumption in the network. Because of the massive number of devices in the access network, reducing the power consumption in this part of the network has a major impact on the total network power consumption. Starting from the proven BiPON technology, the concept is extended to introduce multiple levels of bit-interleaving. The paper discusses the CBI protocol in detail, as well as an ASIC implementation of the required custom CBI Repeater and End-ONT.
Introduction
When the World Wide Web was first introduced in the 1990s, a minimal range of services was offered to a very limited set of end users. Since then, the number of services has vastly increased, as has the number of end users [1]. These two factors have led to an ever-increasing demand for higher bandwidth, causing the data rates in the access networks to rise very quickly. Along with these rising data rates, the power consumption of the numerous optical network units (ONUs) has increased. Today, the Internet accounts for about 10% [2] of the global power consumption. A contribution of this size clearly has a significant environmental impact [3, 4]. The emergence of new technologies such as the move to the cloud and the Internet of Things, along with a number of developing countries where the number of Internet connections is rising strongly, ensures a growing demand for higher bandwidth in the foreseeable future. Together with this increase, the importance of its effects on the environment grows. In order to sustain the welfare people have become used to in the developed world, we will have to find ways to reduce the pressure our society is putting on the earth and its resources. One of these ways is to significantly reduce the power consumption of the Internet, given its large contribution to the global power consumption.
An example of these efforts is the bit-interleaving passive optical network (BiPON) protocol, introduced in [5, 6] in the context of GreenTouch. This was a radical paradigm shift from the current dogmatic use of packet-based communication, demonstrating a massive reduction in power consumption. A BiPON ONU ASIC was designed and implemented [7], resulting in a 10 Gbps system which demonstrated a power saving factor from 35 to 180, depending on the downstream rate [6].
In this paper, we propose the 40 Gbps BiPON protocol, accommodating the need for a 4× data rate increase while achieving a low power consumption. Additionally, we introduce the cascaded bit-interleaving PON (CBI-PON) concept [8, 9, 10]: applying the bit-interleaving technique on different levels throughout the access network to reduce the power consumption even more.
Section 2 starts by explaining the concept of a bit-interleaving PON (as introduced in [6, 7]) as an alternative to the traditional packet-based PON. Section 3 gives a detailed explanation of CBI-PON, the cascaded extension of BiPON. Section 4 focuses on the 40 Gbps implementation of the CBI-PON protocol, whereas section 5 discusses the required analog front-end. Section 6 explains the digital processing in the 40 Gbps CBI-PON network devices. Section 7 introduces the 40 Gbps ASIC implementation named CABINET. The power consumption reduction is addressed in section 8, followed by the conclusion in section 9.
Bit-interleaving Passive Optical Network

Packet-based Time Division Multiplexing
A traditional PON uses time division multiplexing (TDM) to allow the optical line terminal (OLT) to reach multiple ONUs on a shared optical fiber. In this scenario, each ONU is assigned a time slot in which it receives a designated data packet, avoiding collision problems. However, all data sent from the OLT reaches every ONU, which then has to process all this data to extract the useful data and discard all data destined for other ONUs. This is shown schematically in Figure 1.
Bit-based Time Division Multiplexing
An alternative approach to packet-based time division multiplexing is to reduce the appointed time slots to 1 bit instead of 1 packet. Every successive bit is therefore intended for a different ONU. This is illustrated in Figure 2 .
Compared to packet-based TDM, bit-based TDM requires a modified OLT to send the data in a bit-interleaved fashion and modified ONUs to correctly interpret the incoming data.
Receiver architecture comparison
We consider XG-PON [11] as an example of a packet-based TDM implementation of a PON. A typical XG-PON receiver has an architecture as depicted in Figure 3.
From this architecture it is apparent that the major drawback of using packet-based TDM is its lack of power efficiency. It is not until the XGEM stage that unneeded data is discarded and the receiver can start operating at the user rate instead of the full line rate. Since power consumption scales with the processing speed, the circuits operating at the full line rate clearly raise the total power consumption.
BiPON Frame
In BiPON, data is organized in custom BiPON frames having 2 parts: a header and a payload. The header is a fixed-rate (1/256 of the line rate) section used for [...] These settings are used to adjust the receiver hardware to sample the payload bits that are destined for that specific receiver.
Cascaded BiPON: CBI-PON
In BiPON, the bit-interleaving is only applied at the level of the access network, where the leaf nodes are custom ONUs. We propose a cascaded BiPON (CBI-PON) that goes beyond the borders of the access networks and brings the BiPON concept to the metro/edge network. Figure 6 depicts the cascaded BiPON network architecture.
Alongside the development of the End-ONTs (Optical Network Terminals), serving as the leaf nodes for the access network, the need for repeaters to connect different PON levels arises.
During the development of the CBI-PON, a 3-level PON was assumed in order to retain reasonable end-user speeds. Even though CBI-PON was developed for a 3-level PON, the concept is scalable to any number of levels. Every level has a maximum downstream line rate which is 1/4 of that of its parent level. The upstream line rate is always 1/4 of the downstream line rate.
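The rate scaling described above can be made concrete with a short calculation. The following is only an illustrative sketch: it assumes a 40 Gbps L1 downstream line rate (the rate of the demonstrator) and treats each level as running at its maximum rate; the function name `cbi_line_rates` is hypothetical.

```python
from fractions import Fraction


def cbi_line_rates(l1_downstream_gbps, levels=3):
    """Return a list of (downstream, upstream) rates in Gbps, one per level.

    Assumption: every level runs at its maximum downstream rate, i.e.
    exactly 1/4 of its parent level; upstream is 1/4 of downstream.
    """
    rates = []
    downstream = Fraction(l1_downstream_gbps)
    for _ in range(levels):
        upstream = downstream / 4
        rates.append((float(downstream), float(upstream)))
        downstream = downstream / 4  # next level runs at 1/4 of this one
    return rates


rates = cbi_line_rates(40)
for level, (down, up) in enumerate(rates, start=1):
    print(f"L{level}: downstream {down} Gbps, upstream {up} Gbps")
```

Under these assumptions, a 3-level network starting at 40 Gbps ends at 2.5 Gbps downstream on L3, which illustrates why adding further levels quickly reduces the rate available per level.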
CBI Levels and Devices
The primary level L1 is a metro/edge network that consolidates traffic from the core network, other metro/edge networks and access networks. On this primary level, we find L1 CBI Repeaters (R1) and L1 CBI End-ONTs (N1). 
CBI Frame
As in BiPON, data traffic is organized in a custom frame. Due to the multi-level architecture, this custom frame is more complex than was the case in BiPON. At [...]
For a primary level (L1) CBI frame, the payload determined by a particular header lane is either an L2 CBI frame (to be forwarded by an L1 Repeater) or L1 end-user data (to be received by an L1 End-ONT).
Similarly, for a secondary level (L2) CBI frame, the payload belonging to a header lane is either an L3 CBI frame (to be forwarded by an L2 Repeater) or L2 end-user data (to be received by an L2 End-ONT).
Finally, the payload in an L3 CBI frame lane is always L3 end-user data (to be received by an L3 End-ONT). The CBI frame structure is illustrated in Figure 7.
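The nesting rule for lane payloads can be summarized in a few lines. This is a minimal sketch of the rule as stated in the text for a 3-level network; the function name `payload_kinds` and the string labels are illustrative and not part of the CBI protocol definition.

```python
def payload_kinds(level, max_level=3):
    """List what a header lane's payload can contain at a given CBI level."""
    if not 1 <= level <= max_level:
        raise ValueError(f"level must be between 1 and {max_level}")
    kinds = []
    if level < max_level:
        # Above the last level, a lane may carry a complete frame of the
        # next level, which the Repeater at this level forwards.
        kinds.append(f"L{level + 1} CBI frame (forwarded by an L{level} Repeater)")
    # At every level, a lane may carry end-user data for an End-ONT.
    kinds.append(f"L{level} end-user data (received by an L{level} End-ONT)")
    return kinds


for lvl in (1, 2, 3):
    print(f"L{lvl} lane payload:", " or ".join(payload_kinds(lvl)))
```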
The frame length is kept constant throughout all CBI-PON levels and equals 125 µs. To achieve this constant frame length, the number of header lanes and the payload sizes of L2 and L3 CBI-PON frames scale proportionally with the input line rate.
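A constant 125 µs frame means the number of bits per frame is proportional to the line rate. The sketch below shows this arithmetic, assuming line rates of 40, 10 and 2.5 Gbps for L1, L2 and L3 (each level at 1/4 of its parent, starting from 40 Gbps); how those bits are split between header lanes and payload is not modeled here.

```python
FRAME_LENGTH_US = 125  # constant frame length at every CBI level


def frame_bits(line_rate_mbps):
    """Total bits in one frame: Mbit/s times microseconds gives bits."""
    return line_rate_mbps * FRAME_LENGTH_US


# Assumed line rates per level, in Mbps.
assumed_rates_mbps = {"L1": 40_000, "L2": 10_000, "L3": 2_500}
bits_per_frame = {lvl: frame_bits(r) for lvl, r in assumed_rates_mbps.items()}

for lvl, bits in bits_per_frame.items():
    print(f"{lvl}: {bits} bits per 125 us frame")
```

With these assumed rates, each level carries a quarter of the bits per frame of its parent level, consistent with the proportional scaling of header lanes and payload sizes.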
40 Gbps CBI-PON Implementation
As a proof of concept, a 40 Gbps CBI-PON was implemented. Table 1 shows the line rates chosen for the multiple levels to accommodate a 40 Gbps CBI-PON, and how the number of header lanes and payload sizes vary according to the line rate in order to maintain a 125 µs frame length. Figure 8 shows a CBI frame with its L1, L2 and L3 implementations for the 40 Gbps demonstrator.
As mentioned in section 3, we need a CBI Interleaver, CBI Repeaters (L1 and L2) and CBI End-ONTs (L1, L2 and L3) to realize a CBI-PON. In the following paragraphs, the implementation of these devices is discussed.
[...] signal. Figure 9 shows the schematic overview of the CBI Interleaver.
CBI Repeater
As shown in Figure 10, the CBI Repeater uses a custom [...]
CBI End-ONT
The CBI End-ONT uses the same custom IC (CABINET) [...]
CABINET ASIC
As mentioned in the previous paragraphs, a custom ASIC was required to implement the CBI Repeater and End-ONT. Therefore, we developed the CABINET ASIC: a multi-mode, multi-rate chip supporting both repeater and End-ONT mode operation at rates corresponding to the primary (L1), secondary (L2) and tertiary (L3) level.
The CABINET IC is composed of two large blocks: (1) a 40 Gbps analog front-end and (2) a CBI frame processing block. In the following sections, we will discuss each of these blocks in more detail.
[...] Therefore, the sampling clock should be phase-aligned with the incoming data to ensure correct sampling. The clock-and-data recovery (CDR) circuit uses the transitions of the incoming data to achieve this phase alignment.
40 Gbps Analog Front-End
The CDR architecture used is a typical charge pump (CP) CDR using a 2× oversampling bang-bang phase detector (BB-PD) and a second-order loop filter [13, 14]. It is, however, a 1:4 sub-sampling CDR. This means that only 1 out of 4 bits is recovered from a 40 Gbps input stream, because the decimation rate in the CBI protocol is at least [...]
Sampling stage
The sampling stage of the CDR uses 1:4 sub-samplers. 
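At the bit level, 1:4 sub-sampling amounts to keeping every fourth bit of the incoming stream, starting at a chosen position. The following is a purely behavioral sketch in software; it says nothing about the circuit implementation, and the function name `subsample` and the example stream are illustrative.

```python
def subsample(bits, phase, factor=4):
    """Keep 1 out of `factor` bits, starting at position `phase`."""
    if not 0 <= phase < factor:
        raise ValueError("phase must be in the range [0, factor)")
    return bits[phase::factor]


# Example stream in which four bit-interleaved lanes share one line.
stream = [1, 0, 0, 0,  1, 1, 0, 0,  0, 1, 0, 0,  1, 0, 0, 0]
lane0 = subsample(stream, 0)
lane1 = subsample(stream, 1)
print("lane 0:", lane0)
print("lane 1:", lane1)
```

The recovered stream runs at a quarter of the line rate, so everything after the sampler can be clocked at 10 Gbps or lower instead of 40 Gbps.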
Phase Detector Logic Core
In order to sample both a and c in the center of the bit, [...] This gives us enough information to adjust the clock phase. This is summarized in Figure 14.
8-phase 10 GHz Voltage-Controlled Oscillator
Even though only a 10 GHz clock is needed, we need 8 phases of this clock to be able to sub-sample the correct bit [...]
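The timing behind the 8 clock phases can be checked with simple arithmetic. The sketch below assumes the 8 phases are equally spaced over one 10 GHz clock period; relating the result to the 2× oversampling phase detector is our reading of the text rather than a statement from the design itself.

```python
from fractions import Fraction

# All times in picoseconds; Fraction keeps the arithmetic exact.
clock_period_ps = Fraction(1000, 10)   # 10 GHz clock: 100 ps period
bit_period_ps = Fraction(1000, 40)     # 40 Gbps line rate: 25 ps per bit

num_phases = 8
phase_step_ps = clock_period_ps / num_phases     # spacing between phases
phases_per_bit = bit_period_ps / phase_step_ps   # sample positions per bit
bits_per_clock = clock_period_ps / bit_period_ps  # line bits per clock cycle

print(f"phase step: {float(phase_step_ps)} ps")
print(f"sample positions per 40 Gbps bit: {int(phases_per_bit)}")
print(f"line bits per 10 GHz clock period: {int(bits_per_clock)}")
```

Under the equal-spacing assumption, the phases are 12.5 ps apart, giving two candidate sampling instants per 40 Gbps bit (an edge and a center position) for each of the four bits that pass during one clock period.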
CBI Frame Processing
In this section, we discuss the digital processing required for the CBI protocol. Once the clock and data have been recovered, the received frame data has to be interpreted. As a first step, common to both CBI Repeater and End-ONT, a synchronization procedure should be followed. Following synchronization, the payload data is treated depending on whether the receiver is a CBI Repeater or an End-ONT.
Synchronization procedure
During the synchronization procedure, the header data of the incoming frame is sampled and a search for the SYNC word is performed. Once it has been found, the header lane offset is calculated based on the difference between the received ONU ID and the device's ONU ID. This procedure is shown in Figure 16. Once synchronization is achieved, the header data can be processed.
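The two steps of the synchronization procedure, the SYNC search and the lane offset calculation, can be sketched as follows. This is an illustration under stated assumptions, not the CABINET logic: the SYNC pattern, the number of header lanes and the modulo wrap-around of the offset are all hypothetical choices made for the example.

```python
def find_sync(header_bits, sync_word):
    """Return the index where sync_word first occurs, or -1 if absent."""
    n = len(sync_word)
    for i in range(len(header_bits) - n + 1):
        if header_bits[i:i + n] == sync_word:
            return i
    return -1


def lane_offset(received_onu_id, device_onu_id, num_lanes):
    """Lanes to skip from the sampled lane to the device's own lane.

    Assumption: lanes are numbered by ONU ID and wrap around modulo
    num_lanes; the text only states that the offset is based on the
    difference between the two IDs.
    """
    return (device_onu_id - received_onu_id) % num_lanes


SYNC = [1, 0, 1, 1, 0, 0, 1, 0]           # hypothetical SYNC pattern
header = [0, 0, 1] + SYNC + [1, 1]         # sampled header bits
sync_index = find_sync(header, SYNC)
offset = lane_offset(received_onu_id=5, device_onu_id=9, num_lanes=16)
print("SYNC found at bit", sync_index, "lane offset", offset)
```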
Repeater
In repeater mode, the header data is descrambled to obtain the bandwidth map (BWMAP), which is then parsed to get the necessary parameters about the link rate. The payload of incoming frames is then forwarded to the output without any further processing. For every subsequent frame received, the BWMAP is monitored for changes. This flow is depicted in Figure 17.
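The repeater flow can be sketched as a small state machine that keeps the last BWMAP, reacts only when it changes, and passes the payload through untouched. The sketch assumes the BWMAP has already been descrambled (the scrambler is not specified here), and the class name `RepeaterSketch` and the `link_rate_divider` field are hypothetical.

```python
class RepeaterSketch:
    """Behavioral sketch of repeater mode; not the CABINET implementation."""

    def __init__(self):
        self.bwmap = None           # last bandwidth map seen
        self.reconfigurations = 0   # how often the link settings changed

    def process_frame(self, bwmap, payload):
        # Monitor the (already descrambled) BWMAP of every frame and
        # update the link parameters only when it differs from the last one.
        if bwmap != self.bwmap:
            self.bwmap = bwmap
            self.reconfigurations += 1
        # Forward the payload without any further processing.
        return payload


repeater = RepeaterSketch()
out1 = repeater.process_frame({"link_rate_divider": 4}, b"frame-1")
out2 = repeater.process_frame({"link_rate_divider": 4}, b"frame-2")
out3 = repeater.process_frame({"link_rate_divider": 8}, b"frame-3")
print(out1, out2, out3, repeater.reconfigurations)
```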
CABINET ASIC Implementation
The CABINET ASIC as described was implemented in a 40 nm low-power CMOS technology. The full layout of the ASIC is shown in Figure 18, while a picture of the fabricated die is shown in Figure 19.
Power Consumption Reduction Estimates
Due to some issues with the first prototype of the CABINET ASIC, not all modes were completely functional. However, based on the simulated power consumption and the working parts of the ASIC, the total power consumption of the ASIC could be extrapolated, the results of which are presented in Table 8.
Considering an L1 Repeater, we find the power estimate to be 187.63 mW. Looking at 2.5G PONs, the av- [...]
More importantly, we can evaluate the power consumption in the network by comparing the state-of-the-art solution with our proposed solution.
State-of-the-art solution
The state-of-the-art solution is a combination of Ethernet metro/edge networks and GPON access networks.
From the Cisco product brief, we can estimate the power consumption of a 2x10GE line card at 370 W, while a 1GE GPON OLT port consumes about 11 W. Power consumption of a GPON ONU is estimated at 6.5 W. A small access fiber-to-the-home (FTTH) deployment with 768 single family units (SFUs) requires 2 times a 2x10GE line card and, assuming a 32 split, 24 GPON OLT ports and [...] deployment is therefore estimated to be 5996 W.
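The 5996 W total can be reproduced from the per-device figures given above. The calculation below assumes one GPON ONU per single family unit (768 in total), which is not stated explicitly in the surviving text but yields exactly the quoted total; the variable names are illustrative.

```python
# Per-device power figures quoted in the text, in watts.
LINE_CARD_W = 370.0   # 2x10GE line card
OLT_PORT_W = 11.0     # 1GE GPON OLT port
ONU_W = 6.5           # GPON ONU

sfus = 768
split_ratio = 32
line_cards = 2
olt_ports = sfus // split_ratio   # 768 / 32 = 24 OLT ports
onus = sfus                       # assumption: one ONU per SFU

state_of_the_art_w = (line_cards * LINE_CARD_W
                      + olt_ports * OLT_PORT_W
                      + onus * ONU_W)
print(f"OLT ports: {olt_ports}, total power: {state_of_the_art_w} W")
```

The ONUs dominate the total (4992 W of 5996 W), which is why reducing the power consumed at the leaf nodes has the largest effect on the network total.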
Proposed solution
Replacing the state-of-the-art solution with a CBI-PON, by using a CBI Interleaver as a line card, a CBI Repeater for every GPON OLT port and replacing the GPON ONTs with CBI End-ONTs, we can calculate the power reduction that is to be expected when using a CBI-PON.
The CBI Interleaver power consumption is assumed [...] With these numbers, the total network power consumption estimate for the CBI-PON becomes 2439 W.
Power reduction estimate
The state-of-the-art solution consumes a total of 5996 W, while the proposed CBI-PON solution only consumes 2439 W, which is an estimated power reduction of around 60%. 
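The reduction figure follows directly from the two totals. A short check, using only the 5996 W and 2439 W estimates quoted above:

```python
state_of_the_art_w = 5996.0   # Ethernet metro/edge plus GPON access
cbi_pon_w = 2439.0            # proposed CBI-PON

saved_w = state_of_the_art_w - cbi_pon_w
reduction_pct = 100.0 * saved_w / state_of_the_art_w
print(f"saved {saved_w} W, reduction {reduction_pct:.1f}%")
```

The exact ratio is about 59.3%, consistent with the stated reduction of around 60%.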
Conclusion
In this paper, we have proposed the 40 Gbps cascaded bit-interleaving PON as an extension of the BiPON introduced in [5, 6]. The CBI Repeater and CBI End-ONT implementations were introduced, together with the multi-rate, multi-mode CABINET ASIC implementation. The critical building blocks for the CABINET ASIC were discussed, followed by the CBI frame processing.
Acknowledgement
