    PAM4 Transmitter and Receiver Equalizers Optimization for High-Speed Serial Links

    As the telecommunications market evolves, the demand for faster data transfer and processing continues to increase. To meet this demand, peripheral component interconnect express (PCIe) has increased its data rates from PCIe Gen 1 (4 Gb/s) to PCIe Gen 5 (32 Gb/s). This evolution has brought new challenges because of high-speed interconnect effects, which can cause data loss and intersymbol interference. Under these conditions, the traditional non-return-to-zero (NRZ) modulation scheme has become a bottleneck due to bandwidth limitations in high-speed interconnects. The 4-level pulse amplitude modulation (PAM4) scheme is being implemented in the next generation of PCIe (PCIe6), doubling the data rate without increasing the channel bandwidth. However, while PAM4 solves the bandwidth problem, it also brings new challenges in post-silicon equalization. Tuning the transmitter (Tx) and receiver (Rx) across different interconnect channels can be very time-consuming because of the multiple equalizers implemented in the serializer/deserializer (SerDes). Current industrial practice for tuning SerDes equalizers requires massive lab measurements, since it is based on exhaustive enumeration methods, making the equalization process too lengthy and practically prohibitive under current silicon time-to-market commitments. In this master’s dissertation, a numerical method is proposed to optimize the transmitter and receiver equalizers of a PCIe6 link. The experimental results, obtained in a MATLAB simulation environment, demonstrate the effectiveness of the proposed approach, which delivers optimal PAM4 eye diagram margins while significantly reducing jitter. ITESO, A.C.
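The claim that PAM4 doubles the data rate at the same symbol rate can be illustrated with a minimal sketch. The Gray-coded level mapping and the normalized amplitudes (-3, -1, +1, +3) below are illustrative assumptions, not values taken from the dissertation, and the function names are hypothetical.

```python
# Illustrative sketch: NRZ carries 1 bit per symbol, PAM4 carries 2 bits per symbol.
# The Gray mapping and the normalized levels are assumptions made for this example.
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def nrz_encode(bits):
    """Map each bit to one of two levels (one symbol per bit)."""
    return [1 if b else -1 for b in bits]


def pam4_encode(bits):
    """Map each pair of bits to one of four levels (one symbol per two bits)."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


bits = [0, 0, 0, 1, 1, 1, 1, 0]
nrz_symbols = nrz_encode(bits)
pam4_symbols = pam4_encode(bits)
print(len(nrz_symbols), len(pam4_symbols))
```

For the same bit sequence, PAM4 needs half as many symbols as NRZ, which is why the data rate can double without raising the symbol rate. The price is three stacked eyes with smaller vertical openings, which is what makes equalizer tuning harder.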

    Experimental Evaluation and Comparison of Time-Multiplexed Multi-FPGA Routing Architectures

    Emulating large, complex designs requires multi-FPGA systems (MFS). However, inter-FPGA communication is constrained by a lack of interconnect capacity due to the limited number of FPGA input/output (I/O) pins. Serializing parallel signals onto a single trace effectively addresses the limited I/O pin obstacle. Besides the multiplexing scheme and the multiplexing ratio (the number of inter-FPGA signals per trace), the choice of MFS routing architecture also affects the critical path latency. The routing architecture of an MFS is the interconnection pattern of FPGAs, fixed wires, and/or programmable interconnect chips. The performance of existing MFS routing architectures is also limited by off-chip interface selection. In this dissertation, we proposed novel 2D and 3D latency-optimized time-multiplexed MFS routing architectures. We used a rigorous experimental approach and real sequential benchmark circuits to evaluate and compare the proposed and existing MFS routing architectures. This research provides new insight into the benefits of using an off-chip optical interface and three-dimensional MFS routing architectures. Vertical stacking results in shorter off-chip links, improving the overall system frequency, with the additional advantage of a smaller footprint area. The proposed 3D architectures employed serialized interconnect between intra-plane and inter-plane FPGAs to address the pin limitation problem. Additionally, all off-chip links were replaced by optical fibers, which improved latency and resulted in a faster MFS. Results indicated that exploiting the third dimension provided latency and area improvements compared to 2D MFS. We also proposed latency-optimized planar 2D MFS architectures in which electrical interconnections are replaced by an optical interface in the same spatial distribution. 
Performance evaluation and comparison showed that the proposed architectures reduce critical path delay and improve system frequency compared to conventional MFS. We also experimentally evaluated and compared the system performance of three inter-FPGA communication schemes, i.e., Logic Multiplexing, SERDES, and MGT, in conjunction with two routing architectures, i.e., Completely Connected Graph (CCG) and TORUS. Experimental results showed that SERDES attained a higher maximum frequency than the other two schemes. However, for very high multiplexing ratios, the performance of SERDES and MGT became comparable.

    WDM/TDM PON bidirectional networks single-fiber/wavelength RSOA-based ONUs layer 1/2 optimization

    This thesis proposes the design and optimization of a hybrid WDM/TDM PON at the L1 (PHY) and L2 (MAC) layers, in terms of minimum deployment cost and enhanced performance for greenfield NGPON. The particular case of RSOA-based ONUs and an ODN using a single fibre and a single wavelength is analysed in depth. The relevant parameters of this WDM/TDM PON are optimized. Special attention is given to the main noise impairment in this type of network: the Rayleigh backscattering effect, which cannot be prevented. To understand its behaviour and mitigate its effects, a novel mathematical model of Rayleigh backscattering in burst-mode transmission is presented for the first time and used to optimize the WDM/TDM RSOA-based PON. In addition, a cost-effective, simple SCM WDM/TDM PON design with RSOA-based ONUs was optimized and implemented. This prototype was successfully tested, showing high performance, robustness, versatility, and reliability. The system can serve up to 1280 users at 2.5 Gb/s downstream and 1.25 Gb/s upstream over 20 km, while remaining compatible with the GPON ITU-T recommendation. This precedent has enabled the SARDANA network to extend the design, architecture, and capabilities of a WDM/TDM PON to a long-reach metro-access network (100 km). A proposal for an agile Transmission Convergence sub-layer is presented as another relevant contribution of this work. It is based on optimizing the GPON and XG-PON standards (for compatibility), applied to a long-reach metro-access TDM/WDM RSOA-based PON with a higher client count. Finally, a proposed physical implementation of SARDANA layer 2 and possible configurations for SARDANA internetworking with the metro network and the core transport network are presented.

    SatCat5: A Low-Power, Mixed-Media Ethernet Network for Smallsats

    In any satellite, internal bus and payload systems must exchange a variety of command, control, telemetry, and mission data. In too many cases, the resulting network is an ad-hoc proliferation of complex, dissimilar protocols with incomplete system-to-system connectivity. While standards like CAN, MIL-STD-1553, and SpaceWire mitigate this problem, none can simultaneously meet the need for high throughput and low power consumption. We present a new solution that uses Ethernet framing and addressing to unify a mixed-media network. Low-speed nodes (0.1-10 Mbps) use simple interfaces such as SPI and UART to communicate with extremely low power and minimal complexity. High-speed nodes use so-called “media-independent” interfaces such as RMII, RGMII, and SGMII to communicate at rates up to 1000 Mbps and enable connection to traditional COTS network equipment. All are interconnected into a single smallsat-area-network using a Layer-2 network switch, with mixed-media support for all these interfaces on a single network. The result is fast, easy, and flexible communication between any two subsystems. SatCat5 is presented as a free and open-source reference implementation of this mixed-media network switch, with power consumption of 0.2-0.7 W depending on network activity. Further discussion includes example protocols that can be used on such networks, leveraging IPv4 when suitable but also enabling full-featured communication without the need for a complex protocol stack.

    Machine learning approach for high speed link modeling and IBIS-AMI model generation

    The high-speed link system is one of the major components in the networking infrastructure. Developing a high-performance behavioral model for such a system is crucial but challenging, especially when taking nonlinearity into account. This work reports modeling the high-speed link (HSL) system using machine learning and implementing the model in IBIS-AMI, an industry standard for SerDes simulation and verification. We started by developing a Volterra series model, extracting the Volterra kernels using feed-forward neural networks. We proposed a monomial power series neural network (MPSNN) that can extract Volterra kernels related to nonlinearity up to the third order. We developed an analytical mapping from neural network weights to Volterra kernels. This mapping allows accurate time-domain signal reconstruction from the extracted Volterra kernels. We applied the MPSNN to model 4-level pulse amplitude modulation (PAM-4) and non-return-to-zero (NRZ) systems. Volterra kernels up to the third order can be accurately identified. However, the curse of dimensionality associated with the Volterra series impedes its practical application. The number of Volterra kernels increases exponentially with the memory length and the nonlinearity order, and the large number of kernels consumes a vast amount of computational power during signal reconstruction. To address this challenge, we proposed a Laguerre-Volterra feed-forward neural network (LVFFN). The input time-series signal is orthogonalized, in other words, Laguerre-expanded, before it is fed to the neural network. The dimension of the input signal is significantly reduced, which results in far fewer neurons in the hidden layer. We modeled the PAM-4 and NRZ systems with the LVFFN. The resulting model has up to six orders of magnitude fewer parameters than the Volterra series. 
We could also model just the receiver instead of the whole system, which adds flexibility to the model in practical applications. The LVFFN model largely overcomes the curse of dimensionality associated with the Volterra series. The next question is how to use it. Since the machine-learning-based model is not a standardized model, it is difficult to co-simulate with models generated by other approaches. To circumvent the challenges in model transportability and interoperability, we implemented the LVFFN model as an IBIS-AMI model, an industry-standard model format that is compatible with most circuit simulators. We could simulate the LVFFN IBIS-AMI model in Keysight ADS and conduct eye-diagram analysis. IBIS-AMI model generation is not trivial. It requires cross-disciplinary knowledge of signal integrity, HSL circuits, and software engineering. To facilitate IBIS-AMI model generation, we developed a software tool, ezAMI, that can generate the IBIS-AMI model with a few clicks. The software is developed using Qt/C++ and is open source. The software architecture and a tutorial are also presented in this dissertation.
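To make the curse of dimensionality concrete, the sketch below counts the coefficients of a discrete-time Volterra series for a given memory length and nonlinearity order. It is an illustration under stated assumptions, not the dissertation's method: the order-p kernel is taken to have memory^p coefficients (or the number of unique entries when the kernel is symmetric), the constant zeroth-order term is ignored, and the function name is hypothetical.

```python
from math import comb


def volterra_coefficient_count(memory, order, symmetric=True):
    """Count kernel coefficients for orders 1..order (zeroth-order term excluded).

    An order-p kernel over `memory` taps has memory**p entries; if the kernel
    is assumed symmetric, only comb(memory + p - 1, p) of them are unique.
    """
    total = 0
    for p in range(1, order + 1):
        total += comb(memory + p - 1, p) if symmetric else memory ** p
    return total


# Growth with memory length for a third-order model.
for memory in (10, 50, 100):
    full = volterra_coefficient_count(memory, 3, symmetric=False)
    unique = volterra_coefficient_count(memory, 3, symmetric=True)
    print(f"memory={memory:4d}  full={full:8d}  unique={unique:7d}")
```

Even with symmetry, the count grows quickly with memory length, which is the motivation for compressing the input (here, by Laguerre expansion) before it reaches the network.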

    Transmitter and receiver equalizers optimization for PCI Express Gen6.0 based on PAM4

    The continuously increasing bandwidth demand from new applications has led to the development of PCIe Gen6, which reaches data rates of 64 GT/s and adopts the PAM4 modulation scheme. While PAM4 solves the bandwidth constraint in high-speed interconnects, it brings new challenges for physical channel analysis. Equalization (EQ) plays an important role even with PAM4 signaling. The PCIe specification defines requirements for performing EQ at the transmitter (Tx) and at the receiver (Rx). During the EQ process, one combination of Tx/Rx EQ coefficients must be selected to meet the performance requirements of the system. Testing all possible coefficient combinations is prohibitive. Current industrial practice consists of finding a subset of combinations during post-silicon validation using maps of EQ coefficients. Finding this subset of coefficients is time-consuming, in addition to all the new challenges imposed by PAM4. In this paper, we propose an optimization approach for PCIe Gen6 link EQ. Our proposal is based on a suitable objective function formulated over the channel operating margin (COM), a new figure of merit (FOM) adopted by communication standards for signaling speeds beyond 25 Gbps. ITESO, A.C.

    Automatic calibration systems for NG-PON2 transceivers

    Modern society is increasingly dependent on communication services, demanding better and faster connections, with connections on the order of hundreds of Gbit/s expected in the near future. As transmission rates increase, so does the bit error ratio (BER), because factors such as noise, signal loss, and jitter have a greater impact than they do at low rates. This project involves the development of a BER test system for both continuous and burst-mode transmission, demonstrating the viability of communication over the next-generation technology NG-PON2, which uses high transmission rates (10 Gbit/s). For this purpose, an FPGA architecture was implemented that supports high transmission rates over long distances in the optical network. This choice is a more economical alternative to commercial equipment and has several advantages, such as the flexibility to reprogram and adapt the architecture to the needs of the user. To meet the proposed requirements, the project was divided into three parts. In the first part, an architecture was developed to measure the error rate during continuous-mode transmission. In the second part, to assess the viability of the communication in real time and to give the user control over the system, an interface between the computer and the FPGA was developed that allows certain characteristics of the communication channel to be changed. The last part of the project produced an architecture similar to the first one, except that transmission is performed in burst mode rather than continuous mode, this being one of the requirements of greatest interest for NG-PON2. Finally, a proof of concept was carried out on an optical network provided by the company PICadvanced, which allowed the different parts of the project to be validated. 
These validations will allow the development of new modules that will later contribute to the main project under development at PICadvanced, which aims to build an automatic calibration board for XFP (10 Gigabit Small Form Factor Pluggable) transceivers. Master's in Electronics and Telecommunications Engineering.
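The core of a BER test is comparing a known transmitted pattern with what is received. The sketch below shows that comparison in software, as an illustration only. The abstract does not state which test pattern the FPGA design uses; the PRBS7 generator (a 7-bit linear feedback shift register) and all function names here are assumptions.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS7 sequence from a 7-bit LFSR (assumed test pattern)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of the two oldest bits held in the register.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return bits


def count_bit_errors(sent, received):
    """Count positions where the received bit differs from the sent bit."""
    if len(sent) != len(received):
        raise ValueError("sequences must have the same length")
    return sum(s != r for s, r in zip(sent, received))


def bit_error_ratio(sent, received):
    """Return the fraction of bits received in error."""
    return count_bit_errors(sent, received) / len(sent)


sent = prbs7(1000)
received = list(sent)
for position in (10, 500, 999):  # inject three bit errors for the demonstration
    received[position] ^= 1
print(count_bit_errors(sent, received), bit_error_ratio(sent, received))
```

In burst mode the same comparison applies, but only over the bits inside each burst, so the checker also has to know where each burst starts and ends.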

    An In-Depth Analysis of the Slingshot Interconnect

    The interconnect is one of the most critical components in large-scale computing systems, and its impact on application performance will increase with system size. In this paper, we describe Slingshot, an interconnection network for large-scale computing systems. Slingshot is based on high-radix switches, which allow building exascale and hyperscale datacenter networks with at most three switch-to-switch hops. Moreover, Slingshot provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes. Slingshot uses an optimized Ethernet protocol, which allows it to interoperate with standard Ethernet devices while providing high performance to HPC applications. We analyze the extent to which Slingshot provides these features, evaluating it on microbenchmarks and on several applications from the datacenter and AI worlds, as well as on HPC applications. We find that applications running on Slingshot are less affected by congestion than on previous-generation networks. Comment: To be published in Proceedings of The International Conference for High Performance Computing Networking, Storage, and Analysis (SC '20) (2020)