Hybrid Clock Recovery for a Gigabit POF Transceiver Implemented on FPGA
Julio Ramírez, Antonino Nespola, Stefano Straullu, Paolo Savio, Silvio Abrate, Member, IEEE, and Roberto Gaudino, Senior Member, IEEE
Abstract-In this paper, we present a clock recovery system implemented on a field programmable gate array and integrated into the Gigabit Ethernet media converter for PMMA SI-POF developed within the framework of the POF-PLUS EU Project. We demonstrate timing synchronization using only one sample per symbol from a highly distorted and attenuated 2-PAM signal without requiring any sort of preequalization. This is achieved by means of a hybrid analog-digital PLL with a timing error detector based on a modified version of the Müller and Mueller algorithm, a loop filter, and a VCXO.
Index Terms-Clock recovery (CR), DSP, field programmable gate array (FPGA), Gigabit Ethernet, optical communications, phase-locked loop (PLL), polymer optical fiber, timing error detector (TED).
I. INTRODUCTION

In recent years, the growing demand for bandwidth has driven the need for higher-performance communication systems. As a consequence, traditional communication solutions have been dramatically improved, and significant research efforts have led to the creation of new technologies capable of coping with ever-increasing requirements. As part of this trend, European telecom operators, together with the European Union, have been actively working on and creating policies to bring broadband access to the European continent. In this context, the European Seventh Framework Programme (FP7) hosted the POF-PLUS Project [1], an initiative aimed at promoting research and development of short-range optical communication solutions based on plastic optical fiber (POF), in order to provide wired and wireless services for in-building/in-home networks and to investigate the feasibility of optical interconnect applications. As reported in [2] and [3], this initiative led to the implementation on a field programmable gate array (FPGA) of a Gigabit Ethernet media converter in full compliance with the IEEE 802.3 Ethernet Standard and capable of overcoming the impairments introduced by the POF channel. As shown in [4], the most critical issue in POF transmission schemes is to overcome the severe limitations in terms of available channel bandwidth.
In fact, the available electrical-to-electrical 6-dB bandwidth (from the electrical input of the transmitter to the electrical output of the photodiode) is below 100 MHz, while the media converter transmits above 1 Gbit/s. As a result, the received eye diagram is completely closed due to intersymbol interference. On top of this, because of fiber attenuation, the received signal after the target POF length is very weak, so the signal-to-noise ratio is also very low. The key elements of the proposed architecture are thus as follows.
• A highly optimized equalization algorithm to overcome intersymbol interference (explained in detail in [2]).
• Forward error correction (FEC) in the form of a (255, 237) Reed-Solomon (RS) code [2].
• A clock recovery system based on a properly optimized phase-locked loop (PLL), able to recover synchronism with a completely closed eye diagram.
As stated in [3], the first versions of the transceiver did not include a clock recovery (CR) system; therefore, in order to test and debug the proposed architecture, it was necessary to pass the clock directly from the transmitting node to the receiving node. In summary, the system was able
• to perform 2-pulse amplitude modulation (PAM), resonant cavity light emitting diode (RC-LED)-based transmission over 50+ m of standard A4a.2 1 mm poly-methylmethacrylate step-index plastic optical fiber (PMMA SI-POF) with a high optical power margin of 4 dB;
• to run real traffic, implementing a complete media converter between standard Gigabit Ethernet 1000Base-T and the PMMA SI-POF.
After having successfully validated the operation of this first prototype, we proceeded to complete it by implementing the required timing recovery system.
In this paper, we describe the chosen CR architecture and its hardware implementation on FPGA. In particular, we demonstrate the timing recovery capabilities of the system for continuous-mode data transmission based on 2-PAM signals, without requiring preequalization schemes and while retaining the full functionality of the previously validated media converter. The next section presents the design process of the system.
II. TIMING RECOVERY SYSTEM
A. POF Channel
The first step of the design process consists of obtaining an expression to model the impairments inflicted by the POF channel on the received signal.
Accordingly, the analysis starts from the block diagram shown in Fig. 1, from which the overall transfer function of the POF channel can be derived as
$$H(\omega) = H_{TX}(\omega)\, H_{POF}(\omega)\, H_{RX}(\omega) \tag{1}$$
where $H_{POF}(\omega)$ is modeled as a linear time-invariant (LTI) low-pass filter, while $H_{TX}(\omega)$ and $H_{RX}(\omega)$ correspond to the theoretical transfer functions of the 2-PAM transmitter plus RC-LED and of the optoelectronic receiver PD, respectively. Once the channel is modeled, the signal at the output of the optoelectronic receiver $Y_R(t)$ can be expressed as
$$Y_R(t) = \sum_{n} x_n\, h(t - nT - \varepsilon T) + v(t) \tag{2}$$
where $h(t)$ is the overall impulse response corresponding to $H(\omega)$, $T$ is the symbol period, $x_n$ denotes the transmitted 2-PAM symbols, $v(t)$ is the inherent additive colored Gaussian noise introduced during the optoelectronic conversion, and $\varepsilon T$ is the unknown fractional time delay between the transmitter and the receiver ($-1/2 < \varepsilon < 1/2$). In order to maximize noise immunity, $Y_R(t)$ must be sampled at the instants of maximum eye opening, referred to as optimum sampling instants; identifying them requires adjusting the phase of the sampling clock according to $\varepsilon T$. For this purpose, the receiver must contain a clock synchronizer, a device that produces an estimate $\hat{\varepsilon}$ of this delay [5].
There are two main types of clock synchronizers, categorized according to their architecture as feedforward and feedback synchronizers, the latter also referred to as error-tracking synchronizers [5]. Further classifications based on other criteria can be made. For instance, if the synchronizer relies on decided symbols to produce a timing estimate, it is defined as decision directed; otherwise, it is nondata aided [5]. Moreover, it can be further categorized by its operation domain, i.e., analog or digital, as a continuous-time or discrete-time system, and by the data transmission mode as a burst-mode or continuous-mode clock synchronizer.
For the present case, it was decided to implement a continuous-mode and nondata-aided error-tracking synchronizer by means of a hybrid analog-digital architecture.
B. Hybrid Synchronizer Architecture
The general diagram of the resulting system is depicted in Fig. 2. As seen, the incoming symbols, transmitted at a line rate of 1.0991 Gbps (nominally 1.1 Gbps), are sampled by the on-board analog-to-digital converter (ADC), which operates in double data rate (DDR) mode. The samples are forwarded to the timing error detector (TED), based on the Müller and Mueller (M&M) algorithm [6]; the resulting error signal is averaged by the loop filter and finally converted to the analog domain by a Δ-Σ modulator followed by an RC filter, which together operate as a DAC [7] and drive the VCO. Error-tracking synchronizers thus apply the PLL concept to derive a sampling clock from the received signal [5].
Furthermore, in order to maximize flexibility and ease future upgrades, it was decided to implement most of the system inside the FPGA, so that the scalability of our architecture toward higher bit rates is mostly limited by the analog devices present in the loop and, in particular, by the capabilities of the ADC. In the architecture shown in Fig. 2, the ADC is the "fastest" and most critical circuit required since, inside the FPGA, all the subsequent signal processing is done using highly parallelized algorithms. In fact, the FPGA clock is 275 MHz, significantly lower than the bit rate.
In the following, the design and implementation of each block composing this hybrid architecture are presented.
C. Müller and Mueller TED
As mentioned above, the M&M TED is implemented according to the timing recovery methods proposed in [6]. Typically defined as a decision-directed synchronizer [5], its conventional implementation, as part of a clock recovery system, is shown in Fig. 3.
As seen, this device derives the delay estimate $\hat{\varepsilon}T$ by estimating the error $e_k$ between the equalized PAM signal $Y_S(k)$ and the decided symbols $a_k$, so that, assuming sample times $t = kT$, the error for the $k$th symbol is expressed as [6]
$$e_k = Y_S(k)\, a_{k-1} - Y_S(k-1)\, a_k \tag{3}$$
The diagram shown in Fig. 3 differs from the architecture proposed in Fig. 2 in the way in which $e_k$ is derived: instead of estimating the phase error from the preequalized signal $Y_S(k)$ and its corresponding decided symbol, the M&M TED is modified so that it requires only the unequalized received samples $Y_R(k)$. The redefined M&M algorithm can be expressed as
$$e_k = Y_R(k)\, \mathrm{sign}_{k-1} - Y_R(k-1)\, \mathrm{sign}_k \tag{4}$$
where $\mathrm{sign}_k$ denotes the sign of the $k$th sample of the received signal $Y_R(k)$.
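As an illustration only, the sign-based error computation can be sketched in a few lines of Python. The sketch assumes the usual M&M cross-product form with the decided symbols replaced by sample signs, that is, e[k] = y[k]·sign(y[k-1]) - y[k-1]·sign(y[k]); the function names are hypothetical and the code processes samples serially rather than in the parallel fashion an FPGA design would use.

```python
def sign(v: float) -> int:
    """Return +1 for non-negative samples and -1 otherwise (2-PAM sign)."""
    return 1 if v >= 0 else -1


def mm_ted_sign(samples):
    """Sign-based M&M-style timing error, one sample per symbol.

    Assumed form: e[k] = y[k] * sign(y[k-1]) - y[k-1] * sign(y[k]).
    The first sample has no predecessor, so the output has len(samples) - 1
    entries.
    """
    errors = []
    for k in range(1, len(samples)):
        prev, cur = samples[k - 1], samples[k]
        errors.append(cur * sign(prev) - prev * sign(cur))
    return errors
```

With this form, two consecutive samples of equal magnitude give a zero error, while unequal magnitudes give a nonzero error whose sign the loop can use to steer the sampling phase.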
The M&M TED, as defined by (4), was first implemented in the FPGA and then validated and characterized. As part of the characterization, it is essential to determine the sensitivity $K_d$ of the TED, because it is required to design the loop filter. To retrieve its value, it is necessary to derive the S-curve, which results from plotting the estimated phase error $\hat{\varepsilon}$ (measured at the output of the TED operating in open loop) versus the actual phase difference.
The setup used to validate the system included a pseudorandom binary sequence (PRBS) generator that allowed delaying the transmitted signal with respect to the transmitting clock. For the experiment, PRBS sequences of different lengths and with different phase delays were transmitted, filtered by an RC filter with a response similar to that of the POF channel, and finally forwarded to the TED. The resulting S-curve is shown in Fig. 4. To estimate $K_d$, it is necessary to determine the slope of the curve in the vicinity of the zero-crossing point. In this case, the analysis yielded $K_d$ = 0.35 V/rad.
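One possible way to extract the slope from measured S-curve points is a least-squares line fit restricted to a small window around the zero crossing. The Python sketch below is a minimal example under stated assumptions: the zero crossing is taken to lie at zero phase, the window width is chosen by the user, and the function name is hypothetical.

```python
def ted_gain_from_s_curve(phase_rad, ted_out, window_rad):
    """Least-squares slope of the S-curve for |phase| <= window_rad.

    phase_rad: applied phase offsets in radians.
    ted_out:   averaged open-loop TED outputs at those offsets.
    Assumes the zero crossing of the S-curve lies at phase 0.
    """
    pts = [(p, e) for p, e in zip(phase_rad, ted_out) if abs(p) <= window_rad]
    if len(pts) < 2:
        raise ValueError("need at least two points inside the window")
    n = len(pts)
    mean_p = sum(p for p, _ in pts) / n
    mean_e = sum(e for _, e in pts) / n
    num = sum((p - mean_p) * (e - mean_e) for p, e in pts)
    den = sum((p - mean_p) ** 2 for p, _ in pts)
    if den == 0:
        raise ValueError("points inside the window share the same phase")
    return num / den
```

Restricting the fit to the linear region matters because the S-curve flattens and folds back away from the zero crossing, which would bias the slope downward.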
D. Loop Filter and Closed-Loop Analysis
This section describes the mathematical model used for designing the loop filter. The analysis is performed using the analog model of a PLL shown in Fig. 5. The transfer function of the analog PLL is expressed as [5]
$$H(s) = \frac{\theta_o(s)}{\theta_i(s)} = \frac{K_d K_o F(s)}{s + K_d K_o F(s)} \tag{5}$$
where $\theta_o$ and $\theta_i$ represent the phase of the VCO and of the incoming signal, respectively, $F(s)$ is the transfer function of the loop filter, and $K_o$ is the gain of the VCO. For the purposes of this project, it was decided to implement a second-order PLL capable of tracking the phase and frequency deviations of the incoming signal with respect to the clock generated by the VCO. Such a device is obtained by including an integrator in the loop filter. Accordingly, the resulting transfer function of the loop filter can be expressed as [5]
$$F(s) = \frac{1 + s\tau_2}{s\tau_1} = K_1 + \frac{K_2}{s} \tag{6}$$
where $K_1 = \tau_2/\tau_1$, $K_2 = 1/\tau_1$, and $\tau_1$, $\tau_2$ are the RC time constants of the filter. Now, by substituting (6) into (5), the transfer function of the PLL becomes
$$H(s) = \frac{K_d K_o \left(K_1 s + K_2\right)}{s^2 + K_d K_o K_1 s + K_d K_o K_2} \tag{7}$$
from which the loop gain can be derived as
$$K = K_d K_o K_1$$
Equivalently, (7) can be expressed in terms of the natural frequency $\omega_n$ and damping factor $\zeta$ as
$$H(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2} \tag{8}$$
where
$$\omega_n = \sqrt{K_d K_o K_2}, \qquad \zeta = \frac{K_d K_o K_1}{2\omega_n} \tag{9}$$
Equations (8) and (9) are used to design the PLL. The time response of the system depends directly on $\zeta$, and therefore its value is critical for guaranteeing system stability [5], [7]. Usually, $\zeta$ is set to 0.707, and the remaining variables are defined accordingly. A commercial voltage-controlled crystal oscillator (VCXO), the Si550 from Silicon Labs, was used to generate the sampling clock. According to laboratory tests, this device presents a gain $K_o$ of 99 kHz/V, while, as mentioned above, the M&M TED presented a gain $K_d$ of 0.35 V/rad. Finally, the required $\omega_n$ for the system was chosen as 4 kHz. Once these parameters were defined, we proceeded to design and implement the loop filter.
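To make the design step concrete, the sketch below solves the standard second-order relations for a loop filter of the form F(s) = K1 + K2/s, namely 2ζω_n = K_d·K_o·K1 and ω_n² = K_d·K_o·K2. The unit conventions are assumptions: ω_n and K_o are converted from Hz and Hz/V to rad/s and rad/s/V, and the function name is hypothetical. The resulting continuous-time values are not expected to coincide with fixed-point coefficients in an FPGA, whose scaling depends on implementation details such as word lengths and sampling period.

```python
import math


def pi_loop_gains(omega_n, zeta, k_d, k_o):
    """Return (K1, K2) of F(s) = K1 + K2/s for a second-order PLL.

    omega_n: natural frequency in rad/s
    zeta:    damping factor
    k_d:     detector gain in V/rad
    k_o:     oscillator gain in rad/s/V
    """
    k = k_d * k_o
    k1 = 2.0 * zeta * omega_n / k   # from 2*zeta*omega_n = K_d*K_o*K1
    k2 = omega_n ** 2 / k           # from omega_n^2     = K_d*K_o*K2
    return k1, k2


# Values quoted in the text; the 2*pi conversions are an assumption.
omega_n = 2 * math.pi * 4e3    # 4 kHz natural frequency
k_o = 2 * math.pi * 99e3       # 99 kHz/V oscillator gain
k1, k2 = pi_loop_gains(omega_n, 0.707, 0.35, k_o)

# Equivalent time constants, using K1 = tau2/tau1 and K2 = 1/tau1.
tau1 = 1.0 / k2
tau2 = k1 / k2
```

Substituting the returned gains back into the expressions for ω_n and ζ is a quick consistency check on any chosen set of parameters.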
E. Digital Transformation and FPGA Implementation of the Loop Filter
In order to implement the loop filter in the FPGA, it is necessary to transform it from the analog domain $F(s)$ into the digital domain $F(z)$. This is achieved by means of a bilinear transformation, which maps the left half of the s-plane into the interior of the unit circle of the z-plane, thus guaranteeing that any stable system in the analog domain is transformed into a stable digital system. The bilinear transformation is defined as [7]
$$s = \frac{2}{T_S}\, \frac{1 - z^{-1}}{1 + z^{-1}} \tag{10}$$
where $T_S$ is the sampling period. The resulting loop filter architecture was implemented as shown in Fig. 6. The gain constants $K_1$ and $K_2$ were defined taking into account the previously defined values $\omega_n$, $K_d$, and $K_o$ and, as depicted in Fig. 6, were implemented using shift registers instead of bulkier and slower multipliers. In the end, the coefficients resulted in $K_1$ = 0.00001994 and $K_2$ = 0.5569 and, as seen, they were approximated by binary divisions as $2^{-1}$ and $2^{-13}$, respectively.
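A minimal Python sketch of a shift-based proportional-plus-integral filter is given below for illustration. Applying the bilinear transformation to an integrator yields a trapezoidal accumulator, acc[n] = acc[n-1] + x[n] + x[n-1], with the factor T_S/2 absorbed into the integral gain. The class name is hypothetical, and the shift amounts 1 and 13 simply mirror the powers of two quoted above; which path each one belongs to in the actual design is an assumption here.

```python
class ShiftPILoopFilter:
    """Integer PI loop filter whose gains are powers of two (right shifts)."""

    def __init__(self, prop_shift: int, integ_shift: int):
        self.prop_shift = prop_shift    # proportional gain = 2**-prop_shift
        self.integ_shift = integ_shift  # integral gain     = 2**-integ_shift
        self.acc = 0                    # trapezoidal accumulator state
        self.prev = 0                   # previous input sample

    def step(self, err: int) -> int:
        # Bilinear (trapezoidal) integration: accumulate x[n] + x[n-1].
        self.acc += err + self.prev
        self.prev = err
        # Python's >> on ints is an arithmetic shift, as in signed hardware.
        return (err >> self.prop_shift) + (self.acc >> self.integ_shift)


# Example: constant error for two samples, then zero error.
lf = ShiftPILoopFilter(prop_shift=1, integ_shift=13)
outputs = [lf.step(8192), lf.step(8192), lf.step(0)]
```

After the error returns to zero, the output keeps the value held in the integral path, which is what allows a second-order loop to track a constant frequency offset.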
F. Digital to Analog Conversion
The DAC was implemented following Xilinx application note XAPP154 [8]. This document describes its implementation using a Δ-Σ modulator, for which it provides a template written in Verilog, as well as the schematic of the corresponding passive RC filter. The top-level diagram of the system is shown in Fig. 7.
As depicted, the code provided implements a modulator that operates with a 100 MHz clock; therefore, it was necessary to modify the code to operate at 275 MHz (the FPGA clock). Different resistance and capacitance values were also chosen for the external passive filter. For more details regarding this device, the application note should be consulted. Having described the CR system, we now present the results obtained from its validation and from its operation as part of the fully engineered 1 Gbit/s media converter.
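The principle of a Δ-Σ DAC can be illustrated with a generic first-order accumulator modulator, sketched below in Python. This is not the structure of the application note's Verilog template; it is a simplified stand-in in which the carry-out of a fixed-width accumulator forms the 1-bit output stream, whose average (recovered by the external RC filter) is proportional to the input code. The function name and parameters are hypothetical.

```python
def delta_sigma_bits(code: int, width: int, n_clocks: int):
    """First-order delta-sigma modulation of a constant unsigned code.

    code:     input value, 0 <= code < 2**width
    width:    accumulator width in bits
    n_clocks: number of modulator clock cycles to simulate
    Returns the 1-bit output stream as a list of 0/1 values.
    """
    if not 0 <= code < (1 << width):
        raise ValueError("code out of range for the given width")
    mask = (1 << width) - 1
    acc = 0
    bits = []
    for _ in range(n_clocks):
        total = acc + code
        bits.append(total >> width)  # carry-out is the 1-bit DAC output
        acc = total & mask           # keep the residue for the next cycle
    return bits
```

Over 2**width clock cycles the number of ones equals the input code, so the duty cycle of the stream is code / 2**width; a faster modulator clock lets the RC filter remove the switching ripple with a higher cutoff frequency.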
III. EXPERIMENTAL RESULTS AND DISCUSSION
A. Testing the Clock Recovery System
A series of tests was performed to characterize the CR system. The first experiments were aimed at measuring the holding window of the system, which is the range of frequencies over which the system is able to lock the clock, and the jitter throughout this holding window. The experimental setup is shown in Fig. 8.
As detailed, it consisted of a PRBS generator, the optoelectronic transmitter (RC-LED), 50 m of PMMA SI-POF, an optoelectronic converter (A3PICs), the media converter (CR+EQ) implemented inside the FPGA, the external VCXO, a bit error rate (BER) tester, and a real-time oscilloscope. The experiment was performed as follows: first, a clock frequency near the target one was fixed; PRBS sequences of different lengths were transmitted; the media converter was restarted; and, if the system was able to lock the clock and operate error free, the frequency was considered to be within the holding window, and the jitter was measured directly from the eye diagram on the oscilloscope. Table I lists the results obtained; in particular, note that the line rate of the system is 1.0991 Gbps, which corresponds to a symbol period of 0.91 ns.
Once the holding window was experimentally delimited, the convergence time was measured. To this end, a flag signal was generated inside the FPGA; when the error estimated by the M&M TED was bounded within certain values that indicated a state of convergence, the flag was enabled. The resulting curve, obtained with a PRBS of length $2^{23} - 1$, is shown in Fig. 9, where it is evident that the convergence time is directly proportional to the frequency deviation. The fact that the frequencies are negative is just a matter of convention, because the reference and starting point of the tracking algorithm's scan is set to the far right limit of the holding window. Moreover, these results show that the convergence time for the chosen operating frequency is 55 ms. This time can be reduced either by moving the operating frequency toward the starting point of the scan or by starting the scan in the vicinity of the operating frequency.
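A convergence flag of this kind can be sketched as a simple bounded-error detector. In the Python example below, the flag is raised once the TED error has stayed within a threshold for a given number of consecutive samples; both the threshold and the hold count are hypothetical parameters, since the bounds used in the FPGA are not specified here.

```python
def lock_flag(errors, threshold, hold):
    """Return a per-sample convergence flag.

    The flag is True once |error| <= threshold has held for at least
    `hold` consecutive samples, and drops as soon as the bound is violated.
    """
    flags = []
    run = 0  # length of the current run of in-bound errors
    for e in errors:
        run = run + 1 if abs(e) <= threshold else 0
        flags.append(run >= hold)
    return flags
```

Requiring several consecutive in-bound samples avoids declaring convergence on an error that merely crosses zero while the loop is still pulling in.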
B. Fully Engineered 1 Gbit/s Media Converter
A final test was performed to validate the operation of the fully engineered 1 Gbit/s media converter (including the clock recovery system) with 50 m of PMMA SI-POF. The experimental setup is shown in Fig. 10. The test consisted of the full-duplex transmission of real arbitrary traffic generated by means of an Agilent N2X Router Tester. With the Router Tester, the overall delay of the system was measured to be below 30 μs. Moreover, the media converter operated error free for transmission without extra attenuation at the receiver, i.e., with a received optical power of −9.5 dBm. Finally, the curve of the BER as a function of the received optical power is shown in Fig. 11. As seen, the system guarantees a total power margin of 4 dB before FEC, which means that the inclusion of the CR system in the media converter does not imply a penalty in terms of optical power margin when compared with the results reported in [3]. This is the most important result of the project.
IV. CONCLUSION
This paper presented the design and implementation of a continuous-mode and nondata-aided CR system for a 1 Gbit/s media converter for PMMA SI-POF applications. In particular, it has been demonstrated that a hybrid digital-analog PLL based on a modified M&M TED is capable of
• recovering synchronism from a highly distorted and attenuated 2-PAM signal without requiring any sort of preequalization and without incurring a penalty in terms of optical power margin;
• achieving synchronism in 55 ms and tracking clock frequency variations while maintaining low-jitter operation.
More generally, the results obtained demonstrate a fully engineered 1 Gbit/s media converter in full compliance with the IEEE 802.3 Gigabit Ethernet standard and thus, as far as the POLITO-ISMB partnership is concerned, the complete fulfillment of the main objectives of the POF-PLUS EU Project.
