In this paper, we experimentally demonstrate the transmission of a 50-Gb/s four-level pulse-amplitude modulation (PAM4) signal over 50-km standard single-mode fiber (SSMF) in the C-band using an electroabsorption modulated laser (EML) and a PIN/APD detector with 8.5/6-GHz bandwidth. A simplified Volterra-DFE filter is proposed, which reduces the required tap number by ∼70% compared with traditional DFE and Volterra filtering. Experimental results show that with 86-tap equalization, sensitivities of −15 and −14 dBm are achieved at the FEC limit of $1 \times 10^{-3}$ in the back-to-back (BtB) and 50-km SSMF transmission cases, respectively. In addition, by replacing the PIN-TIA with a 10-Gb/s APD-ROSA for detection, the sensitivity is further improved by 5.5 and 3 dB for the BtB and 50-km SSMF transmission cases, respectively.
Introduction
Driven by the growing bandwidth of wired and wireless access, a large amount of interconnection capacity is required for data transmission inside and between data centers (DCs) [1]. Unlike long-haul transmission, medium- and short-reach optical transmission between DCs is more sensitive to cost, so how to improve network capacity at reduced cost has attracted much attention. Advanced modulation formats with higher spectral efficiency, such as pulse amplitude modulation (PAM), discrete multi-tone (DMT) modulation and carrier-less amplitude and phase (CAP) modulation, have been reported in recent research [2]-[4]. Among these formats, PAM4 offers high spectral efficiency with the lowest complexity and simple digital signal processing, and has therefore aroused great interest [5]-[8]. However, its major restrictions are the strong inter-symbol interference (ISI) induced by insufficient bandwidth, and the distortions caused by chromatic dispersion and by the optics, such as square-law direct detection. To further increase the bandwidth efficiency and enable high-data-rate modulation on low-bandwidth devices, digital equalization techniques such as the Volterra filter (VF) [9]-[11], the feed-forward equalizer and decision-feedback equalizer (FFE/DFE) [12], and maximum likelihood sequence estimation (MLSE) [13] are employed. Transmission of 56-Gb/s PAM4 over 26.4-km SSMF has been demonstrated using a 23-GHz MZM and a PIN-TIA, with transmitter-side pre-compensation and a 24-tap FFE/4-tap DFE combined with MLSE at the receiver side for equalization [14]. 50-km transmission of a 50-Gb/s PAM4 signal has been demonstrated using a 10-GHz DML and a PIN in [15]; a filter with 101 taps is used for pre-equalization, and −6 dBm is finally reached at a BER of $4.7 \times 10^{-3}$ by using a pruned VF equalizer with 275 taps. In [16], 2 × 56-Gb/s PAM4 signal transmission over 100-km SSMF with two 18-GHz DMLs and a 40-GHz PD is demonstrated. The FEC limit is reached after a VF equalizer with memory lengths of (51, 17, 13), corresponding to 659 taps.
To enable practical application, efforts have been made to reduce the complexity of signal equalization [17]. In this paper, we demonstrate a 50-Gb/s PAM4 transmission system based on 10G-class optics, using an EML as the transmitter. A simplified third-order decision-feedback-assisted VF equalization algorithm is proposed. Compared with the traditional first-order DFE and the FFE-based VF, the tap number is decreased from ∼296 to 107 by using Volterra-DFE filtering. By abandoning the inessential high-order taps, the tap number can be further reduced to 86 without introducing a sensitivity penalty. As a result, transmission over 50-km SSMF in the C-band is realized with a sensitivity of −14/−17 dBm for the 10-G PIN-TIA/APD-ROSA detection cases, respectively.
Experimental Setup
The experimental setup for evaluating the equalization performance in the 50-Gb/s PAM4 transmission system is shown in Fig. 1. At the transmitter side, a pseudo-random binary sequence (PRBS) with a word length of $2^{15} - 1$ is mapped into the PAM4 format and then sent to an arbitrary waveform generator (AWG) for digital-to-analog conversion. The sampling rate of the AWG is 64 GSa/s and the symbol rate is set to 25 GBaud. The signal is amplified to a peak-to-peak voltage of 2 V before being applied to the EML. The EML operates at ∼1543 nm and has a maximum output power of 5 dBm. After 25/50-km SSMF transmission, the optical signal is first attenuated by a variable optical attenuator (VOA) for power control and then detected by a 10-G PIN-TIA. Inset (e) shows the measured frequency response of the transceivers. The 10-G EML and PIN-TIA have a combined 3/10-dB bandwidth of ∼8.5/12 GHz. The detected signal is then sampled by an 80-GSa/s digital storage oscilloscope (DSO) with 25-GHz bandwidth for offline analysis. Inset (a) shows the eye diagram of the 50-Gb/s electrical PAM4 signal emitted from the AWG without the electro-optical and optoelectronic devices, in which case the eyes are clearly open. Inset (b) is the eye diagram of the 50-Gb/s PAM4 electrical signal after passing through the BtB optics. Due to the bandwidth limitations of the 10-G EML and PIN, the received signal is roughly converted into a 7-level duobinary-PAM4 signal. After 25/50-km C-band SSMF transmission, the eye diagram is further distorted by dispersion, as insets (c) and (d) show.
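The transmitter-side mapping and the 7-level observation above can be illustrated with a minimal sketch. It assumes the common PRBS15 generator with feedback from stages 15 and 14, a Gray mapping of bit pairs to the levels ±1 and ±3, and an idealized duobinary response in which each symbol simply adds to its predecessor. None of these details is specified in the text, and the function names are hypothetical.

```python
def prbs15(n_bits, seed=0x7FFF):
    """Return n_bits of a PRBS15 sequence (assumed polynomial x^15 + x^14 + 1)."""
    state = seed & 0x7FFF            # 15-bit shift register, must be non-zero
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 14) ^ (state >> 13)) & 1   # XOR of stages 15 and 14
        state = ((state << 1) | new_bit) & 0x7FFF
        bits.append(new_bit)
    return bits

# Assumed Gray mapping from bit pairs to PAM4 levels (not stated in the text).
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def map_pam4(bits):
    """Map consecutive bit pairs to PAM4 amplitudes; a trailing odd bit is dropped."""
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits) - 1, 2)]

def duobinary_levels(levels=(-3, -1, 1, 3)):
    """Amplitudes produced when each symbol adds to its predecessor (1 + D response)."""
    return sorted({a + b for a in levels for b in levels})
```

Under these assumptions the sequence repeats every $2^{15} - 1$ bits, and `duobinary_levels()` yields seven distinct amplitudes, which matches the 7-level eye described for the band-limited BtB link.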
Offline DSP is executed in Matlab. The sampled data from the DSO is first resampled to an integer multiple of the symbol rate. The optimal sampling point is then obtained using the absolute-value symbol timing recovery algorithm [18]. After that, the bit error rate (BER) performance of three kinds of equalization schemes, namely MLSE, FFE/DFE and the VF equalizer, is compared. The traditional FFE/DFE with M forward taps and N feedback taps can be described as follows:
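A standard form of this equalizer, consistent with the coefficient definitions that follow, is sketched here. The symbols $x(n)$ (equalizer input), $d(n)$ (past decisions) and $y(n)$ (equalizer output) are assumed names, and the index ranges are one common convention rather than a confirmed transcription of Eq. (1).

```latex
y(n) = \sum_{k=0}^{M-1} a(k)\, x(n-k) \;+\; \sum_{k=1}^{N} b(k)\, d(n-k) \tag{1}
```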
Where a(k) and b(k) are the feed-forward and feedback tap coefficients, respectively. The DFE filter degenerates to an FFE if there is no feedback part. When high-order interactions are taken into consideration, the FFE filter becomes a VF:
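$$y(n) = \sum_{k_1=0}^{M_1-1} a(k_1)\,x(n-k_1) + \sum_{k_1=0}^{M_2-1}\sum_{k_2=0}^{k_1} a(k_1,k_2)\,x(n-k_1)\,x(n-k_2) + \cdots + \sum_{k_1=0}^{M_p-1}\cdots\sum_{k_p=0}^{k_{p-1}} a(k_1,\ldots,k_p)\prod_{i=1}^{p} x(n-k_i) \qquad (2)$$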
Where $M_p$ represents the memory length of the $p$-th order of the VF and $a(k_1, \ldots, k_p)$ denotes its $p$-th order kernels. When compensating nonlinear distortions, the VF shows better performance than the FFE. The first-order kernels of the VF mitigate the linear impairments, while the second-order kernels deal with the second-order nonlinearity resulting from modulation and demodulation in the optical devices. The main signal-to-signal beating noise (SSBN) interference caused by square-law detection [19], [20] can be written as:
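The standard square-law expansion, written with the symbols defined next, is given here as a sketch. It assumes the detected signal is proportional to the squared magnitude of the total optical field and omits the responsivity constant, so it may differ in scaling from the original Eq. (3).

```latex
V_{DD}(n) = \left| E_{carrier} + E_0(n) \right|^2
          = \left| E_{carrier} \right|^2
          + 2\,\mathrm{Re}\!\left\{ E_{carrier}^{*}\, E_0(n) \right\}
          + \left| E_0(n) \right|^2 \tag{3}
```

The middle term carries the desired signal, while the last term is the signal-to-signal beating contribution.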
Where $V_{DD}(n)$ is the detected signal, $E_{carrier}$ is the carrier signal, $E_0(n)$ is the PAM4 signal and $|E_0(n)|^2$ is the SSBN. Considering the serious ISI caused by the bandwidth limitation in our system, $|E_0(n)|^2$ contains many more interference terms than in traditional cases. The third-order kernels of the VF can be used to deal with fiber nonlinearities [21]. Hence, kernels of different orders compensate for different interferences, which are generally independent of each other. However, for compensating the ISI caused by the limited bandwidth of the optoelectronic devices, the DFE generally performs better than the FFE [18]. Therefore, we propose to combine the DFE with the VF to enhance the equalization performance. The Volterra-DFE filter can be expressed as:
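One way to write such a filter, given here as a sketch rather than a confirmed transcription of Eq. (4), adds a feedback Volterra series on past decisions to the feed-forward series. The symbols $x(n)$, $d(n)$ and $y(n)$ (input, decisions, output), the third-order truncation and the triangular index ranges are assumptions chosen to match the kernel and memory-length definitions in the text.

```latex
y(n) = \sum_{p=1}^{3} \Bigg[
       \sum_{k_1=0}^{M_p-1} \cdots \sum_{k_p=0}^{k_{p-1}}
         a(k_1,\ldots,k_p) \prod_{i=1}^{p} x(n-k_i)
     + \sum_{k_1=1}^{N_p} \cdots \sum_{k_p=1}^{k_{p-1}}
         b(k_1,\ldots,k_p) \prod_{i=1}^{p} d(n-k_i)
       \Bigg] \tag{4}
```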
$N_p$ and $b(k_1, \ldots, k_p)$ in Equation (4) represent the memory length and kernel of the $p$-th order of the Volterra-DFE filter, respectively. As Fig. 2 shows, the blue line $u_N$ represents the input and decision-feedback signals, and $d$ represents the current output of the decision device. The VF-DFE is depicted by the red lines, including both the FFE- and DFE-based VF.
In this paper, we use third-order equalization in both the Volterra-FFE and Volterra-DFE sections. Furthermore, to reduce the complexity of the traditional VF equalizer, we simplify the VF algorithm by abandoning the insignificant kernels and using only the square and cubic terms in the second and third orders, respectively. Since PAM4 requires higher linearity than NRZ, the number of second-order taps in the VF equalizer will be relatively large to compensate for the second-order nonlinearities. The main idea of the optimization is to reduce the complexity of the equalizer while maintaining stable performance. In the optimization process that follows, we therefore focus on the second order and determine the other orders step by step in the simplified case. The performance and complexity of the simplified and traditional VF equalizers are compared. A performance comparison with the FFE/DFE and with MLSE based on the Viterbi algorithm (VA) is also given in the following section. In the following sections, M and N represent the numbers of feed-forward and feedback taps of the FFE/DFE, while $M_1$, $M_2$, $M_3$ and $N_1$, $N_2$, $N_3$ denote the forward and feedback first-order, second-order and third-order memory lengths of the VF-DFE equalizer, respectively.
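The structure of the simplified equalizer can be made concrete with a minimal sketch of how its regressor might be assembled. It assumes that the feed-forward second order keeps half of the symmetric kernel, that the feedback second order keeps only square terms, and that both third orders keep only cubic terms. This reading is consistent with the tap totals reported later, but it is not a transcription of the authors' Matlab code. The function names and the causal window alignment are hypothetical.

```python
def vf_dfe_features(x_win, d_win, M=(15, 10, 1), N=(7, 7, 1)):
    """Build the regressor of a simplified third-order Volterra-DFE (sketch).

    x_win: recent input samples, x_win[0] = x(n), x_win[1] = x(n-1), ...
    d_win: recent decisions,     d_win[0] = d(n-1), d_win[1] = d(n-2), ...
    """
    M1, M2, M3 = M
    N1, N2, N3 = N
    feats = []
    # Feed-forward section
    feats += [x_win[k] for k in range(M1)]                    # linear terms
    feats += [x_win[k1] * x_win[k2]                           # half of the symmetric
              for k1 in range(M2) for k2 in range(k1 + 1)]    # second-order kernel
    feats += [x_win[k] ** 3 for k in range(M3)]               # cubic terms only
    # Feedback section
    feats += [d_win[k] for k in range(N1)]                    # linear terms
    feats += [d_win[k] ** 2 for k in range(N2)]               # square terms only
    feats += [d_win[k] ** 3 for k in range(N3)]               # cubic terms only
    return feats

def vf_dfe_output(weights, feats):
    """Equalizer output as the inner product of tap weights and regressor."""
    return sum(w * f for w, f in zip(weights, feats))
```

With the default memory lengths the regressor has 86 entries, one per tap, so the per-symbol cost is dominated by the 55 second-order feed-forward products.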
Experimental Results

Traditional FFE/DFE and MLSE
Considering the cost of a practical system, the computational complexity of the DSP should be as small as possible while guaranteeing the BER performance. For bandwidth-limited NRZ signals, although MLSE shows better performance, the recommended equalization in [18] is the combination of FFE and DFE. Similarly, Fig. 3 shows the BER curves of the 50-Gb/s PAM4 signal after 50-km transmission, received at −14-dBm optical power, using (a) FFE/DFE and (b) MLSE for equalization. The black curve in Fig. 3(a) shows the results when only the FFE is used; the FEC threshold at a BER of $1 \times 10^{-3}$ cannot be achieved even when the number of FFE taps reaches 400. However, when several feedback taps are added to the equalizer, the BER performance improves markedly, and the FEC threshold can be achieved using 300 FFE taps and 5 DFE taps. Further increasing the number of DFE taps decreases the required number of FFE taps, as the blue and green curves show. Fig. 3(b) shows the decoding performance of MLSE. In our previous work [22], we used MLSE and FFE+MLSE to transmit a 50-Gb/s PAM4 signal over 10 km and achieved good performance. However, for much longer distances beyond 25 km, the desired performance cannot be achieved using MLSE alone or with a 155-tap FFE added for pre-equalization, even when the memory depth of the MLSE is increased to 8.
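For reference, the Viterbi-based MLSE compared above can be sketched as follows. This is a generic textbook version for a known linear FIR channel, not the decoder used in the experiment. The channel taps `h`, the all-zero start state and the squared-error branch metric are assumptions, and the function name is hypothetical.

```python
LEVELS = (-3, -1, 1, 3)

def mlse_viterbi(rx, h):
    """Most likely PAM4 sequence for a known FIR channel h (squared-error metric).

    rx[n] is modelled as sum_k h[k] * s[n-k], with s[n] = 0 for n < 0 (assumption).
    """
    L = len(h) - 1                        # channel memory in symbols
    start = (0,) * L                      # hypothetical all-zero start state
    metric = {start: 0.0}                 # best path metric per state
    paths = {start: []}                   # surviving symbol sequence per state
    for r in rx:
        new_metric, new_paths = {}, {}
        for state, m in metric.items():   # state = (s[n-1], ..., s[n-L])
            for s in LEVELS:
                expected = h[0] * s + sum(h[k] * state[k - 1] for k in range(1, L + 1))
                cand = m + (r - expected) ** 2
                nxt = ((s,) + state[:-1]) if L else ()
                if nxt not in new_metric or cand < new_metric[nxt]:
                    new_metric[nxt] = cand
                    new_paths[nxt] = paths[state] + [s]
        metric, paths = new_metric, new_paths
    best = min(metric, key=metric.get)
    return paths[best]
```

In this formulation the trellis has up to $4^L$ states for a channel memory of $L$ symbols, which illustrates why increasing the MLSE memory quickly becomes expensive for PAM4.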
Volterra FFE/DFE
PIN Receiver:
In this paper, a third-order Volterra-DFE filter is used. Similar to the first-order FFE case, adding a decision-feedback section to the filter significantly improves the equalization performance. Fig. 4 depicts the results when only the Volterra FFE is used, where the FEC threshold at a BER of $1 \times 10^{-3}$ cannot be achieved even when $M_2$ is more than 16 (about 179 taps). However, when several feedback taps are added to the VF equalizer, the BER performance improves markedly. The FEC threshold can be achieved when $M_2 = 13$ and $N = [3, 3, 1]$, using 107 FFE taps and 7 DFE taps. The required number of Volterra FFE taps can be further decreased by increasing the number of DFE taps, as the red and black curves show.
Compared with the first-order FFE/DFE equalizer, the VF introduces second- and third-order terms, which increase the computational complexity significantly. To simplify the algorithm, we analyzed the distribution of the significant kernels and abandoned the insignificant ones. We first set the memory lengths of the first, second and third orders of the VF FFE/DFE to $M = [15, 10, 1]$ and $N = [7, 7, 1]$, respectively. The recursive least squares (RLS) algorithm is employed for tap training. The mean square error (MSE) converged after training on 200 symbols, and the coefficients of the second-order kernels of the FFE and DFE sections are depicted in three-dimensional format in Fig. 5. Fig. 5(a) shows the second-order feed-forward VF kernel when $M_2 = 10$. The size of the tap matrix is 10 × 10. As the matrix is symmetric, only half of the kernels are required. Fig. 5(b) depicts the absolute values of the second-order feedback tap coefficients when $N_2 = 7$. Unlike in the FFE part, only the taps on the diagonal, which correspond to the square terms, have large coefficient values in the DFE section, so the insignificant kernels can be abandoned to reduce complexity. This simplification greatly reduces the number of second-order feedback taps. However, the second-order forward taps still account for a large proportion of the total and leave considerable room for optimization. According to the VF principle described in the second section, the different orders of the equalizer compensate for different impairments and are independent of each other. To reduce complexity, we first evaluated how the tap number in each order influences the equalization performance. Given this independence, the simplification process can start from any order. We performed the evaluation starting with the third order, followed by the first-order FFE, the first- and second-order DFE, and finally the second-order FFE. This procedure might not yield the optimal result, but since it is simple and achieves good performance, we adopted it.
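The tap training step can be illustrated with a minimal RLS sketch in plain Python. It is the textbook exponentially weighted recursion, not the authors' implementation. The forgetting factor `lam`, the initialization constant `delta` and the function name are assumptions. Each entry of `features` would be one equalizer regressor and each entry of `targets` the corresponding known training symbol.

```python
def rls_train(features, targets, lam=0.999, delta=100.0):
    """Exponentially weighted recursive least squares (textbook form, sketch).

    features: list of regressor vectors, one per training symbol
    targets:  list of desired outputs (known training symbols)
    lam, delta: assumed forgetting factor and initial scale of the matrix P
    """
    n = len(features[0])
    w = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, d in zip(features, targets):
        Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]
        denom = lam + sum(u[i] * Pu[i] for i in range(n))
        k = [v / denom for v in Pu]                       # gain vector
        err = d - sum(w[i] * u[i] for i in range(n))      # a priori error
        w = [w[i] + k[i] * err for i in range(n)]         # tap update
        # P <- (P - k u^T P) / lam, using the symmetry of P (u^T P = Pu^T)
        P = [[(P[i][j] - k[i] * Pu[j]) / lam for j in range(n)] for i in range(n)]
    return w
```

Each update costs on the order of $n^2$ operations for $n$ taps, so a nested-list version like this is only practical for small illustrative cases; an 86-tap equalizer would normally be trained with a matrix library.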
Then we evaluated the optimal tap number for each order, taking both complexity and performance into consideration. Usually, the first and second orders of the VF equalizer are used to compensate for the dispersion, ISI and second-order nonlinearity caused by SSMF transmission and the optical devices, while the third order is for high-order nonlinearity compensation [16]. We first analyzed the required tap number of the cubic term. In Fig. 6(a), the number of first-order FFE taps is $M_1 = 15$, and the first and second orders of the DFE are $N_1 = 3$ and $N_2 = 3$, respectively. The data is the 50-km transmitted signal sampled at −14-dBm received power. Fig. 6(a) shows the BER performance of the equalized signal versus the second-order VF-FFE memory length when $M_3$ and $N_3$ in the third order are 0, 1, 3 and 5. The black line represents the performance without the third order, where the FEC threshold is achieved only when $M_2$ is greater than 15. However, when the third order is set to 1, as the red line shows, $M_2$ can be reduced to 12. Further increasing $M_3$ and $N_3$ brings little improvement. Taking the complexity into account, $M_3$ and $N_3$ are set to 1 in the following evaluations. Then we evaluated the requirement on the first-order FFE, i.e., $M_1$. Fig. 6(b) shows the BER performance versus $M_1$ and $M_2$, where $M_3 = 1$ and $N = [5, 5, 1]$. Experimental results show that varying $M_1$ from 15 to 45 does not bring much improvement. Therefore, the first-order FFE memory length is set to 15 in the following evaluations. Fig. 7(a) depicts the impact of $N_1$ and $N_2$. Note that the solid lines represent the proposed simplified VF equalizer and the dotted lines represent the traditional VF equalizer. It can be seen that the simplified VF equalizer introduces a negligible penalty to the system performance, verifying the feasibility of the simplification. The green line represents the results when $N_1 = N_2 = 1$, corresponding to 3 feedback taps in total, where the FEC threshold cannot be met even when $M_2$ is increased to 22. Increasing $N_1$ and $N_2$ to 3 improves the sensitivity considerably, as the red curve shows. The total feedback tap number is 7 in this case. Further increasing $N_1$ and $N_2$ beyond 7 brings insignificant improvement. So finally we set $N_1$ and $N_2$ to 7, corresponding to 15 feedback taps. As a result, the FEC limit can be achieved when $M = [15, 10, 1]$ and $N = [7, 7, 1]$, with 86 taps in total. Fig. 7(b) plots the BER curves under different received powers. The sensitivity of the 10-Gb/s NRZ signal is also given for reference. By using the simplified 86-tap VF-DFE equalizer, a sensitivity of −14 dBm can be achieved for the 50-Gb/s PAM4 signal after 50-km transmission, which is ∼1 dB worse than in the BtB case. The performance of the traditional DFE and VF is also depicted for comparison. To obtain the same sensitivity, 289 taps are required for DFE filtering and 296 taps are required for the VF equalizer.
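The tap totals quoted in this section can be reproduced with a short counting sketch. The rule is inferred from the reported numbers rather than stated explicitly in the text: the feed-forward second order is assumed to keep half of the symmetric $M_2 \times M_2$ kernel, every other order is assumed to contribute one tap per unit of memory length, and the function name is hypothetical.

```python
def simplified_vf_dfe_taps(M, N):
    """Return (feed-forward taps, feedback taps) of the simplified VF-DFE (sketch)."""
    M1, M2, M3 = M
    N1, N2, N3 = N
    ffe = M1 + M2 * (M2 + 1) // 2 + M3   # linear + half symmetric kernel + cubic
    dfe = N1 + N2 + N3                   # linear + square + cubic terms
    return ffe, dfe

ffe_taps, dfe_taps = simplified_vf_dfe_taps((15, 10, 1), (7, 7, 1))
total_taps = ffe_taps + dfe_taps         # 71 feed-forward + 15 feedback
```

Under this rule, $M = [15, 10, 1]$ and $N = [7, 7, 1]$ give 71 + 15 = 86 taps, and $M_2 = 13$ with $N = [3, 3, 1]$ gives 107 feed-forward and 7 feedback taps, in agreement with the figures quoted above.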
APD Receiver:
To further improve the sensitivity, we replaced the PIN receiver with a 10-G APD-ROSA. Fig. 8(a) depicts the frequency response of the 10-G APD; its 3-dB bandwidth is only 6 GHz, which is ∼2.5 GHz lower than that of the PIN receiver. Due to the extremely limited bandwidth, the ISI is more severe than in the PIN-detection case, as insets (i) and (ii) show. The APD has a sensitivity of −28 dBm for a 10-Gb/s NRZ signal. The same VF-DFE equalizer with $M = [15, 10, 1]$ and $N = [7, 7, 1]$ is employed for the APD-detected signal. It can be seen in the figure that the 50-Gb/s PAM4 signal has a BtB sensitivity of approximately −21 dBm. After 50-km SSMF transmission, the sensitivity is degraded to −17 dBm, which is 3 dB better than in the PIN-detection case. To achieve similar performance, a 300-tap DFE or a 351-tap VF is required. Table 1 lists the computational complexity of the different equalizers in terms of multiplications and additions. M and N are the numbers of FFE and DFE taps, and $M_1$, $M_2$, $M_3$ / $N_1$, $N_2$, $N_3$ represent the memory lengths of the FFE/DFE sections of the VF equalizer. The expressions used to calculate the multiplications and additions are also given in the table. It can be seen from the table that the complexity can be reduced by adding a DFE section to the VF filter. Moreover, by abandoning the insignificant taps, the computational complexity can be further reduced. The proposed simplified VF-DFE filter saves ∼50% of the multiplications compared with the first-order DFE filter and ∼70% compared with FFE-based VF filtering.
Table 1: The Computational Complexity Comparison of Equalizers
For further algorithm simplification, some sensitivity can be sacrificed according to practical demands. As illustrated above, the computational complexity mainly comes from the second order of the FFE. Fig. 9(a) shows the receiver sensitivity versus the second-order memory length for the 50-km transmission case. When $M_2$ is decreased from 10 to 6, the sensitivity degrades from −14 dBm to −10.5 dBm with the PIN receiver. Similar results are obtained for the APD receiver: a sensitivity of −14.5 dBm is achieved when $M_2$ is decreased to 6, which is ∼2.5 dB worse than in the $M_2 = 10$ case. As for the complexity, shortening $M_2$ from 10 to 6 reduces the required taps from 86 to 52 and the multiplications from 152 to 84. For real-time implementation, power consumption and latency are critical parameters that need to be considered [23]. In the proposed VF-DFE algorithm, as many as 86 taps are still required, although the equalizer has been greatly simplified. Therefore, great challenges remain for real-time implementation, especially for the feedback sections. In future research, we will make further efforts to simplify the algorithm by reducing the feedback taps or converting the feedback into a feed-forward process to facilitate real-time implementation of the equalizer.
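The multiplication counts quoted above for $M_2 = 10$ and $M_2 = 6$ can be checked with a small sketch. It assumes that each $p$-th order tap costs $p$ real multiplications (one per extra signal factor plus one for the coefficient) and that the feed-forward second order keeps half of the symmetric kernel. This counting rule is inferred from the reported totals and is not taken from the expressions in Table 1. The function name is hypothetical.

```python
def simplified_vf_dfe_multiplications(M, N):
    """Multiplications per output symbol of the simplified VF-DFE (sketch)."""
    M1, M2, M3 = M
    N1, N2, N3 = N
    second_order_ffe = M2 * (M2 + 1) // 2            # half of the symmetric kernel
    taps_per_order = (M1 + N1, second_order_ffe + N2, M3 + N3)
    # A p-th order tap is assumed to need p multiplications.
    return sum(order * taps for order, taps in enumerate(taps_per_order, start=1))

mults_m2_10 = simplified_vf_dfe_multiplications((15, 10, 1), (7, 7, 1))
mults_m2_6 = simplified_vf_dfe_multiplications((15, 6, 1), (7, 7, 1))
```

Under these assumptions the two settings evaluate to 152 and 84 multiplications, which matches the reduction reported for shortening $M_2$ from 10 to 6.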
Conclusion
In this paper, we proposed a simplified third-order Volterra-DFE algorithm. The equalization performance is demonstrated by recovering a 50-Gb/s band-limited PAM4 signal after 50-km transmission. With 107-tap equalization, a sensitivity of −14 dBm is achieved using transceivers with a 3-dB bandwidth of 8.5 GHz. Furthermore, by abandoning the insignificant taps, the complexity of the algorithm can be further reduced by ∼20%, so that only 86 taps are required. Compared with traditional first-order DFE and FFE-based VF filtering, the computational complexity can be reduced by ∼70% without introducing a sensitivity penalty. In addition, by replacing the PIN with a 10-G APD-ROSA, the sensitivity can be improved by 5.5 dB and 3 dB with the same computational complexity for the BtB and 50-km SSMF transmission cases, respectively.
