subscriber ONU and a burst-mode receiver (BM-RX), as well as burst-mode clock and data recovery (BM-CDR) at the OLT of the PON system. Among these BM-PMD components, the 10 Gb/s reshaping and reamplifying (2R) BM-RX is the most challenging to design. It requires fast RX settling, to allow for a short burst preamble and hence high interactivity and throughput, combined with a high RX sensitivity and a wide dynamic range, to be compliant with installed optical distribution networks (ODNs) and to support flexible network deployment. In general, PIN-based BM-RXs are less expensive than those based on avalanche photodiodes (APDs). On the other hand, APD-based BM-RXs offer higher RX sensitivity and are able to support the existing ODNs at higher data rates without extra optical amplifiers in the networks, which makes the APD BM-RX the preferable solution for 10G-class PON applications [2].
Recently, both AC-coupled and DC-coupled BM-RXs have been reported for IEEE 802.3av 10G-EPONs [2–6]. They all employ 64B/66B line coding and allow a longer RX settling time. Although the Full Service Access Network Group (FSAN) has only endorsed XG-PON1 with asymmetrical 10/2.5 Gb/s transmission rates so far, the first symmetrical XG-PON with 10/10 Gb/s has been reported in [7]. The BM-RX reported in [7] is an AC-coupled design that still requires a total RX settling time of 360 ns including CDR. A DC-coupled design, however, can achieve a much shorter RX settling time at the PON physical layer [8]. Moreover, multi-rate operation is a very interesting feature for 10G-GPON systems, but a short settling time is difficult to achieve at the lower rates because only scrambled non-return-to-zero (NRZ) packets are transmitted without any line coding. This paper is an expanded version of our OFC/NFOEC 2012 paper [9]. In this work, we demonstrated the multi-rate operation of our new 10 Gb/s APD-based 2R BM-RX originally designed for symmetric 10G-GPON systems [10]. The overall excellent RX performance of a short settling time, high sensitivity, and large loud/soft ratio was validated in multi-rate burst-mode operation. The proposed BM-RX can process burst signals at 2.5/5/10 Gb/s simultaneously without needing an external reset signal.
A. 10 Gb/s BM-TIA
The BM-TIA performs fast gain settling in three steps (high gain, medium gain, and low gain) by means of a variable-gain TIAc core followed by a variable-gain single-ended-to-differential (S2D) circuit. Ref. [11] proposed a 1.25 Gb/s TIA core with three feedback resistors in parallel using two n-type metal-oxide semiconductor field-effect transistor (nMOS) switches. At 10 Gb/s, the parasitics introduced by these additional feedback paths would limit the TIA front-end bandwidth. Another important design specification for the TIA core is the phase margin. In order to keep the TIA stable, a typical design choice is to make the phase margin larger than 45°, which would result in only 1.2 dB peaking in the frequency domain if a second-order frequency response is assumed. To make sure that the circuit is stable and peaking remains within specifications, Ref. [12] switches the open-loop gain by varying the load resistor at the input stage, but this further complicates the gain-bandwidth tradeoff inside the TIA core because added switching elements introduce extra parasitics. In this design, the second gain switching is done by a load impedance switch in the S2D. This implementation is advantageous over the one in [11], which has two gain switches inside the TIA. The input-referred noise requirement of the S2D is less critical than that of the TIA core, as the S2D is the second stage in the BM-TIA's amplifier chain. This gives more freedom in the design tradeoff and allows better bandwidth optimization. Furthermore, unlike switching the feedback network in a feedback loop, the gain switching in the S2D stage has an open-loop behavior. Therefore, the stability of the TIA is not an issue, resulting in a reliable and faster response.
When a burst has ended, the BM-TIA is reset to high transimpedance gain (64 dBΩ) for maximum RX sensitivity. When a new incoming burst exceeds a certain power threshold, the transimpedance switches to medium gain (47 dBΩ) and is possibly further reduced to low gain (44 dBΩ), all within a few nanoseconds, under the control of logic signals GS1 and GS2. A dummy TIAd generates the references for both the S2D and the gain switch block. The gain switch block generates the logic signals GS1 and GS2 by comparing the signal in the datapath with the reference from TIAd. Once the gain switching is settled, the TIA transimpedance gain is locked and kept constant during the burst payload to avoid performance degradation due to gain fluctuations. Such fast gain settling (<10 ns) can only be achieved by a BM-TIA using a reset signal to erase any memory of the preceding burst [4]. Figure 2 indicates that the BM-TIA has no reset pin. The reset signal is auto-generated in the subsequent BM-LA and conveyed to the BM-TIA via common-mode signaling [13]. The input stage of the BM-LA alters the common-mode voltage of the BM-TIA output. The BM-TIA extracts an on-chip reset signal out of these common-mode changes.
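To give a feel for the three gain settings, the sketch below converts the quoted transimpedance values from dBΩ to ohms (using the usual 20·log10 convention) and mimics the selection made by GS1 and GS2. The function names and the boolean encoding of GS1 and GS2 are assumptions made for illustration; the text only states that the two logic signals step the gain from high to medium and then to low.

```python
# Illustrative sketch only: names and the GS1/GS2 encoding are assumptions,
# not a description of the actual on-chip logic.

GAIN_DBOHM = {"high": 64.0, "medium": 47.0, "low": 44.0}  # values quoted in the text


def dbohm_to_ohm(gain_dbohm: float) -> float:
    """Convert a transimpedance from dBΩ to Ω, with dBΩ = 20*log10(R / 1 Ω)."""
    return 10 ** (gain_dbohm / 20)


def select_gain(gs1: bool, gs2: bool) -> str:
    """Assumed encoding: GS1 requests medium gain, GS2 then requests low gain."""
    if not gs1:
        return "high"  # reset state, maximum sensitivity
    return "low" if gs2 else "medium"


if __name__ == "__main__":
    for gs1, gs2 in [(False, False), (True, False), (True, True)]:
        state = select_gain(gs1, gs2)
        ohms = dbohm_to_ohm(GAIN_DBOHM[state])
        print(f"GS1={gs1!s:5} GS2={gs2!s:5} -> {state:6} gain, about {ohms:.0f} ohm")
```

Under this convention the three settings correspond to roughly 1.6 kΩ, 224 Ω, and 158 Ω, so the first switching step accounts for most of the gain reduction.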
B. 10 Gb/s BM-LA
The interface between the BM-TIA and BM-LA can be AC- or DC-coupled. A fast-response AC-coupled BM-RX was proposed in [14]. Assuming 8B/10B line coding, it obtained a 75 ns settling time when testing with a pseudorandom bit sequence (PRBS) $2^{9}-1$ data pattern. With the 64B/66B line coding specified for IEEE 10G-EPON, it is difficult to achieve such a short settling time with AC-coupled BM-RXs because of the inherent AC-coupling time-constant tradeoff. Recently, an AC-coupled BM-RX with baseline-wander common-mode rejection [5] attained a settling time of 150 ns at the cost of an additional BM-RX sensitivity penalty of 1 dB. To achieve a shorter settling time, DC-coupled interfaces can be applied. Our previous work shows a 10 Gb/s DC-coupled BM-RX with a feed-forward peak-detection-type threshold extraction for long-reach PONs [15, 16]. It achieved a short 2R settling time of 23.8 ns with a BM-RX sensitivity of −15.8 dBm (bit error rate (BER) = $10^{-3}$) at a 15.5 dB loud/soft ratio. However, this feed-forward approach is less accurate for decision threshold extraction and has relatively high power consumption.
In this paper the 10 Gb/s BM-LA is a DC-coupled feedback-type threshold extraction RX. It implements a fast offset integrator with two switchable time constants as shown in Fig. 2. When a new burst arrives, the BM-LA first performs fast offset compensation and amplitude recovery with a short time constant (∼8 ns). Once the correct threshold is established, the offset integrator switches to a larger time constant (∼400 ns) and enters a slow tracking mode, which is critical to provide a higher tolerance to consecutive identical digits (CIDs) within the payload. The DC-coupled fast-response BM-LA needs a reset signal. The timing requirements of the reset signal and its relationship to the burst input are shown in Fig. 3. In these experiments, the reset signal can be provided externally by the system or generated internally by the BM-LA chip itself (auto-reset generation). In the latter case the BM-LA itself detects the end of a burst (EoB), resets the decision threshold to an initial state, and waits for the arrival of the next burst. In order to generate the burst activity signal, the activity detection block measures the power of the input signal and compares it with a predefined threshold. As soon as new burst activity is detected on the chip, it quickly enables the offset compensation loop to extract the decision threshold. The data path, composed of a chain of amplifiers, provides a total of 57 dB gain, and the simulated input-referred rms noise is about 0.8 mV.
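The benefit of switching between the two time constants can be illustrated with a simple first-order model. This is a sketch under the assumption that the offset loop behaves like a single-pole integrator, which the text does not state explicitly; the function names are hypothetical. It shows how quickly an initial threshold error decays in the fast mode, and how far the threshold would drift during a 7.2 ns run of identical bits (72 bits at 10 Gb/s) in each mode.

```python
import math

TAU_FAST_NS = 8.0    # acquisition time constant quoted in the text
TAU_SLOW_NS = 400.0  # tracking time constant quoted in the text


def residual_error(t_ns: float, tau_ns: float) -> float:
    """Fraction of the initial threshold error left after t_ns (single-pole model)."""
    return math.exp(-t_ns / tau_ns)


def drift_fraction(t_ns: float, tau_ns: float) -> float:
    """Fraction by which the threshold moves toward a constant input held for t_ns."""
    return 1.0 - math.exp(-t_ns / tau_ns)


if __name__ == "__main__":
    # Fast mode: error remaining after five time constants (40 ns).
    print(f"residual error after 40 ns, fast mode: {residual_error(40.0, TAU_FAST_NS):.4f}")
    # Drift during a 7.2 ns run of identical bits in each mode.
    print(f"drift over 7.2 ns, fast mode: {drift_fraction(7.2, TAU_FAST_NS):.3f}")
    print(f"drift over 7.2 ns, slow mode: {drift_fraction(7.2, TAU_SLOW_NS):.3f}")
```

In this simplified model the fast mode would let the threshold move by more than half of the signal swing during such a run, whereas the slow mode limits the drift to a few percent, which is consistent with the motivation given for the slow tracking mode.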
C. On-Chip Auto-Reset Generation
To ensure high uplink transmission efficiency in PONs, a short inter-burst guard time and settling time are required for BM-RXs. In this case, a reset signal is used to erase all information from the previous burst and prepare the BM-RX for the newly arriving burst. As shown in Fig. 4(a), this reset signal usually originates from the media access control (MAC) layer, which knows the arrival and end times of bursts from all ONUs. Removing this external reset signal greatly simplifies the interface between the physical (PHY) layer and the MAC layer and enables the use of the BM-RX in reamplification, reshaping, and retiming (3R) nodes where no such timing information is available. Thus it is beneficial if the BM-RX itself (i.e., the physical layer) is capable of detecting when the burst signal ends and generating an internal reset signal as shown in Fig. 4(b). In this case, no time-critical control interfaces cross the barrier between the PHY and transmission convergence (TC) layers, which is a real advantage in terms of interoperability.
The auto-reset generation has to be designed to accommodate new requirements of NG-PONs, where it is mandatory to use a forward error correction (FEC) code when the data rate is 10 Gb/s [17]. This implies that the auto-reset generation should be tolerant to a possibly higher BER in the link. Our previous work [15] generated an on-chip reset signal based on measuring the time since the last received 1. This duration is chosen to be sufficiently longer than the maximum CID in order to avoid an untimely reset signal during the payload. However, in this way the auto-reset generation would suffer from noise, especially in the high pre-FEC BER case. A better algorithm has been proposed in [18]; instead of restarting the counter after a single erroneous 1 bit, we tolerate a number f of faults before the counter restarts. The idea is to generate a reset signal after receiving n consecutive bits of which at most f are 1s, to cope with the high BER case.
The implementation of the on-chip auto-reset generation is shown in Fig. 5, using a clock counter and a data counter. Every time the data counter exceeds its threshold ($f$), the EoB detection is restarted. The on-chip auto-reset is generated after the clock counter exceeds the clock threshold ($n$). In this case the probability of missing the EoB is [18]
where $P_{EoB}(k)$ is the probability of missing the EoB within $k$ bits after the end of the preceding burst, and $P_n$ is the probability of receiving $n$ consecutive 0 bits during the guard time. $P_r(k)$ is the probability that the counter is restarted after exactly $k$ bits, which happens when the $k$th bit is detected as a logic 1 and the EoB was not detected before. Taking into account the time needed to restart the counter, expressed as $r$ bit periods, $P_r(k)$ is equal to
where $N_r$ is the number of times the counter restarts during the EoB detection, $N_1 = k/(n+r-1)$, and $N_2 = k/(f+r)$. $\mathrm{Div}(k, pl)$ is thus the number of possible divisions of $k$ bits over $pl$ places ($pl \geq 1$) with $f$ 1s and maximally $n-1$ bits in each place and is given by
where $C_k^f$ is the number of $f$-combinations from a given set of
As shown in Eqs. (1), (2), and (3), the probability of missing the EoB is strongly dependent on $n$ and $f$. For instance, to avoid missing the EoB, we usually design the EoB detection circuit with the probability $P_{EoB}$ below a given threshold, e.g., $10^{-13}$. In order to achieve this target $P_{EoB}$ with the same time interval and one 1-bit fault allowed (i.e., $f = 1$), we have to reduce the value of $n$ by about half (from 301 to 155) if the pre-FEC BER increases from $10^{-8}$ to $10^{-4}$. Therefore, both the clock counter ($n$) and data counter ($f$) were designed to be fully programmable, which makes the auto-reset generation flexible and robust to use for different BER scenarios.
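The two-counter scheme described in this subsection can be sketched in software as follows. This is a behavioral illustration, not the circuit of Fig. 5: the function name `detect_eob` is hypothetical, "exceeds the clock threshold" is modelled as the clock counter reaching `n`, and the restart time of `r` bit periods is ignored.

```python
from typing import Iterable, Optional


def detect_eob(bits: Iterable[int], n: int, f: int) -> Optional[int]:
    """Return the index of the bit at which the reset would be asserted, or None.

    Assumed behavior: a clock counter counts received bits and a data counter
    counts received 1s. More than f ones restarts both counters; n counted bits
    with at most f ones triggers the auto-reset.
    """
    clock_count = 0
    data_count = 0
    for index, bit in enumerate(bits):
        clock_count += 1
        if bit:
            data_count += 1
        if data_count > f:
            # Too many 1s for a guard time: still inside a burst, restart detection.
            clock_count = 0
            data_count = 0
            continue
        if clock_count >= n:
            return index  # end of burst detected, reset generated here
    return None


if __name__ == "__main__":
    # Two payload bits, then a guard time containing a single erroneous 1.
    stream = [1, 1] + [0] * 3 + [1] + [0] * 10
    print("f = 1:", detect_eob(stream, n=8, f=1))
    print("f = 0:", detect_eob(stream, n=8, f=0))
```

With `f = 1` the isolated erroneous 1 in the guard time does not restart the detection, so the reset is asserted earlier than with `f = 0`, which is the tolerance to a high pre-FEC BER that the text attributes to the algorithm of [18].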
D. Multi-rate Operation
An OLT optionally supporting multi-rate operation can be advantageous because of backward compatibility and a possible smooth upgrade path from existing PON networks. In 10G-EPON upstream, the BM-RXs operate in dual-rate 1G/10G burst-mode reception mode for 8B/10B-coded 1.25 Gb/s and 64B/66B-coded 10.3125 Gb/s bursts [17]. The specified line code forces the maximum CID to have a similar length in time for both 1G and 10G upstream bursts, which relaxes the time-constant design tradeoff if the AC-coupling method is used [18]. However, the line code used in 10G-EPON contributes additional overhead (20% at 1G and 3% at 10G) and reduces the upstream transmission efficiency. Furthermore, the dual-rate BM-RX in [5, 6] uses two BM-LAs, one for 1G and another for 10G, which increases the RX complexity and cost.
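The overhead figures quoted above follow directly from the block sizes of the two line codes. The short sketch below reproduces them, assuming the overhead is expressed as a fraction of the line rate; the helper names are hypothetical.

```python
def line_code_overhead(payload_bits: int, coded_bits: int) -> float:
    """Fraction of the line rate spent on line-code overhead."""
    return (coded_bits - payload_bits) / coded_bits


def effective_rate_gbps(line_rate_gbps: float, payload_bits: int, coded_bits: int) -> float:
    """Data rate left after removing the line-code overhead."""
    return line_rate_gbps * payload_bits / coded_bits


if __name__ == "__main__":
    for name, rate, payload, coded in [("8B/10B", 1.25, 8, 10), ("64B/66B", 10.3125, 64, 66)]:
        overhead = 100 * line_code_overhead(payload, coded)
        data_rate = effective_rate_gbps(rate, payload, coded)
        print(f"{name}: overhead {overhead:.1f}% of line rate, data rate {data_rate:.3f} Gb/s")
```

Scrambled NRZ data without line coding avoids this overhead entirely, at the price of no guaranteed bound on run length from the code itself.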
To improve the upstream efficiency and reduce the cost, it is desirable to allow multi-rate operation for only scrambled NRZ data with a single BM-RX. The main technical challenge is to keep the short settling time of the BM-RX while tolerating a very long sequence of CIDs because of the multi-rate operation. For example, the 1-bit duration of 2.5 Gb/s data is 4 times the bit duration of 10 Gb/s data, which implies that the possible maximum CID length at 2.5 Gb/s is 4 times longer as well. Because the proposed BM-RX has two operation modes with different time constants in the offset compensation loop, it can achieve simultaneously a fast response and a large CID tolerance. Assuming the maximum CID of the scrambled NRZ data is about 72 bits, the BM-RX needs to handle 7.2 ns long CIDs at 10 Gb/s but more than 28.8 ns for 2.5 Gb/s data. In this implementation, the offset integrator was designed to have a large time constant during payload to allow at least 51.2 ns CIDs without significant output jitter deterioration, which makes it suitable for multi-rate operation without extra line coding overhead.
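The CID durations in the paragraph above are plain bit-period arithmetic, which the following sketch reproduces for the three supported rates. The helper names are hypothetical; the 72-bit CID length and the 51.2 ns tolerance are the values quoted in the text.

```python
import math

MAX_CID_BITS = 72        # assumed maximum CID of the scrambled NRZ data (from the text)
CID_TOLERANCE_NS = 51.2  # CID length the offset integrator is designed to tolerate


def cid_duration_ns(cid_bits: int, rate_gbps: float) -> float:
    """Duration of a run of identical bits; one bit lasts 1/rate ns at rate Gb/s."""
    return cid_bits / rate_gbps


def max_cid_bits(tolerance_ns: float, rate_gbps: float) -> int:
    """Longest run, in bits, that fits in the tolerated duration."""
    # The small epsilon guards against floating-point rounding just below an integer.
    return math.floor(tolerance_ns * rate_gbps + 1e-9)


if __name__ == "__main__":
    for rate in (2.5, 5.0, 10.0):
        duration = cid_duration_ns(MAX_CID_BITS, rate)
        bits = max_cid_bits(CID_TOLERANCE_NS, rate)
        print(f"{rate:4.1f} Gb/s: 72-bit CID lasts {duration:4.1f} ns; "
              f"{CID_TOLERANCE_NS} ns covers {bits} bits")
```

Even at 2.5 Gb/s the 72-bit run (28.8 ns) stays well inside the 51.2 ns tolerance, which is why a single time constant in the tracking mode can serve all three rates.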
III. INTEGRATION OF THE 10 Gb/s BM-PMD COMPONENTS AND EXPERIMENTAL RESULTS
New 10 Gb/s BM-TIA and BM-LA components, fabricated in a 0.13 µm SiGe bipolar complementary metal-oxide semiconductor (BiCMOS) process, were integrated as shown in Fig. 6. To emulate the uplink PON system, two 1.3 µm BM-TXs alternately send 10 Gb/s bursts upstream. BM-TX1 is based on an electro-absorption modulator laser (EML) and has an output power of +4.4 dBm and an extinction ratio (ER) of 10 dB. BM-TX2 is a transmitter optical subassembly (TOSA) based on a distributed-feedback laser and has an output power of −0.8 dBm with an ER of 7 dB. The two TX outputs are combined by a 2 × 2 splitter and fed to the BM-RX (Fig. 7). Although the 2R BM-RX is a DC-coupled feedback-type receiver for fast response, it can be used with AC- or DC-coupled BM-CDRs. During the course of the experiment, only an AC-coupled BM-CDR board was available. Therefore, the 2R BM-RX is AC coupled to the BM-CDR providing clock and data recovery.
We first evaluated the BM-RX on its own, with an external reset signal. The 10 Gb/s burst packets consisted of a 76.8 ns preamble and a 1280 ns payload. The guard time between bursts was set to 25.6 ns. The payload was composed of an NRZ $2^{31}-1$ PRBS data pattern plus CID patterns with 72 bits of successive 1s and 0s, respectively. The APD multiplication factor M was set to 9. Figure 8 shows the BER curves measured with an external reset signal. The measured input sensitivity of the BM-RX at a pre-FEC BER of $10^{-3}$ was −31.9 dBm with a static optical power level (BM-static). With two branches of BM-TXs, the BM-RX sensitivity measured on the weak packet emitted by BM-TX1 was −31.3 dBm for the worst case when the input optical power at the RX from BM-TX2 equals −6 dBm. To evaluate the full dynamic range, we performed the overload measurement in the case where the BER was measured on the bursts with high power, preceded by bursts with the lowest input power. The error-free input overload level was found to be higher than −5 dBm. This yields a dynamic range of more than 26.3 dB. We also tested the BM-RX with various payload lengths (up to 1 ms) and the result was stable and did not change with the burst length used. The BM-RX was also assessed for different loud/soft ratios, and the measured BM-RX penalties due to the preceding loud burst are shown in Fig. 9. The maximum BM penalty was only 0.6 dB at a loud/soft ratio of 25.3 dB. Figure 10 shows the measured BERs using the BM-RX with the BM-CDR. The phase difference between subsequent bursts was varied by changing the delay in the pattern generator. The employed BM-CDR is a fast-lock PLL-based CDR, which can acquire and lock to the burst-mode 2R signal within 80 ns at room temperature for the worst-case phase offset [19]. The total preamble for the 3R BM-RX was increased up to 150 ns to accommodate the CDR settling time.
The input sensitivity at a pre-FEC BER of $10^{-3}$ remains unchanged when the BM-RX is followed by the BM-CDR regardless of the input phase difference. Therefore, the BM-CDR is fully settled within the preamble and the BM-RX output already provides an almost ideal 2R-regenerated signal to the CDR.
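The dynamic range and loud/soft ratio reported for the external-reset measurements are differences of the quoted power levels, and the burst timing fixes the fraction of time spent on payload. The sketch below reproduces this arithmetic; the helper names are hypothetical, and the payload fraction is a derived illustration rather than a figure reported in the measurements.

```python
def db_span(upper_dbm: float, lower_dbm: float) -> float:
    """Difference between two optical power levels, in dB."""
    return upper_dbm - lower_dbm


def payload_fraction(guard_ns: float, preamble_ns: float, payload_ns: float) -> float:
    """Share of the burst slot (guard + preamble + payload) that carries payload."""
    return payload_ns / (guard_ns + preamble_ns + payload_ns)


if __name__ == "__main__":
    sensitivity_dbm = -31.3  # weak burst, worst case, pre-FEC BER of 1e-3
    overload_dbm = -5.0      # error-free overload level (lower bound)
    loud_burst_dbm = -6.0    # loud burst from BM-TX2 in the sensitivity test

    print(f"dynamic range   > {db_span(overload_dbm, sensitivity_dbm):.1f} dB")
    print(f"loud/soft ratio = {db_span(loud_burst_dbm, sensitivity_dbm):.1f} dB")
    print(f"payload share   = {100 * payload_fraction(25.6, 76.8, 1280.0):.1f} %")
```

Because the overload level is only a lower bound (higher than −5 dBm), the computed dynamic range is likewise a lower bound.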
We finally evaluated the BM-RX at different upstream data rates with on-chip auto-reset generation. In this experiment, BM-TX2 operated at a data rate of 10 Gb/s while BM-TX1 was sending bursts at 2.5, 5, and 10 Gb/s, respectively. For auto-reset and multi-rate operation, the guard time and settling time were increased to 100 ns and 150 ns, respectively. Guard times up to 1 ms were also tested with on-chip auto-reset, and the BM-RX worked properly. The payload remains an NRZ $2^{31}-1$ PRBS pattern plus CID patterns with 72 bits of 1s and 0s, respectively, for different data rates. The measured BER curves are shown in Fig. 11. In those experiments, the digital elements inside the EoB detection block were not disabled. At 10 Gb/s, the RX sensitivity penalty from using on-chip auto-reset generation instead of an external reset signal is limited to 0.3 dB at a pre-FEC BER of $10^{-3}$. For 2.5 Gb/s operation we assume there is no strong FEC, and therefore the BER threshold was set to $10^{-10}$. At this BER threshold the sensitivity penalty of using on-chip auto-reset is negligible for 2.5 Gb/s operation. With auto-reset generation, an upward deviation of the BER curve is found when the input power is lower than −31 dBm. This is attributed to the fact that input signals below −31 dBm are too weak for the on-chip auto-reset circuit in our current design, which targets supporting Nominal2 Class (N2 Class) optics defined for the XG-PON [20]. A detailed summary and comparison of the BM-RX performance is outlined in Table I.
IV. CONCLUSION
An advanced DC-coupled BM-RX with on-chip auto-reset generation and multi-data-rate support was developed for symmetrical 10G-GPON systems. The experiment validated an excellent RX sensitivity of −31.3 dBm for weak/strong bursts at 10 Gb/s, with a very short 2R RX settling time of 75 ns.
Moreover, to achieve high interoperability and backward compatibility, auto-reset generation and multi-rate operation have been investigated and incorporated into the newly developed BM-RX. We demonstrated experimentally that the BM-RX can process burst signals at 2.5/5/10 Gb/s simultaneously without needing an external reset signal. It shows great potential for use in emerging symmetric 10G-GPON systems.
