Design and Simulation of a 12 Gb/s Transceiver With 8-Tap FFE, Offset-Compensated Samplers and Fully Adaptive 1-Tap Speculative/3-Tap DFE and Sampling Phase for MIPI A-PHY Applications by Menin, Davide et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS 1
Design and Simulation of a 12 Gb/s Transceiver
with 8-Tap FFE, Offset-Compensated Samplers and
Fully-Adaptive 1-Tap Speculative/3-Tap DFE and
Sampling Phase for MIPI A-PHY Applications
Davide Menin, Andrea Bandiziol, Werner Grollitsch, Member, IEEE, Roberto Nonis, Member, IEEE and
Pierpaolo Palestri, Senior Member, IEEE
Abstract—This paper presents a fully-adaptive high-speed
serial interface designed in 28 nm planar CMOS technology for
future MIPI-compliant automotive microcontrollers operating
at 12 Gb/s over long-reach channels. The transmitter has a
voltage-mode driver and operates at full rate featuring an
8-tap feed-forward equalizer with tap programmability of 1/16.
The transmitter's output impedance is tuned by
activating different driver replicas. The half-rate receiver
features an analog front-end which comprises a variable-gain
amplifier and a continuous-time linear equalizer. The subsequent
decision-feedback equalizer has 3 programmable taps, the first
of which is loop-unrolled to relax timing constraints. Another
amplifier is embedded in the DFE’s summing node. We employ
transistor-level simulations to assess the capability of the interface
to optimally adapt to realistic channels: The DFE taps and the
data sampling phase are automatically adapted by means of
a behavioural implementation of an LMS algorithm based on
information gathered through error sampling. Such an interface
was simulated on channels representing likely MIPI A-PHY
to-be-defined specifications featuring up to 33 dB loss at 6 GHz.
Index Terms—High-Speed Serial Interfaces, Automotive, Inter-
Symbol Interference, LMS Adaptive Equalization, MIPI A-PHY
I. INTRODUCTION
High-speed serial interfaces (HSSI) for chip-to-chip
communications are becoming widespread in servers,
portable devices and many other consumer applications due
to the ever-increasing need for high data rates and low
energy-per-bit [1]–[3]. In this respect, automotive systems
lag behind consumer applications in terms of bit rate, but
demand for high-speed interfaces is expected to grow with
the advent of Advanced Driving-Assistance Systems [4]–[6],
hence requiring complex design techniques to be brought into
the automotive environment. Data rates above 10 Gb/s are
expected to become common in the near future, and interfaces
will have to cope with channels featuring significant attenuation [6], [7].
The resulting inter-symbol interference (ISI) requires aggressive
equalization strategies [1], [2], [8], [9]. Due to the change
in the channel characteristics over PVT (process, voltage and
temperature) variations and to allow the transceiver to operate
on largely different channels, equalization parameters need to
be tuned by fully-adaptive algorithms [9].
D. Menin and P. Palestri are with the Polytechnic Department of Engineering
and Architecture (DPIA), University of Udine, 33100 Udine, Italy (e-mail:
menin.davide@spes.uniud.it).
A. Bandiziol, W. Grollitsch and R. Nonis are with Infineon Technologies,
9500 Villach, Austria.
In this paper, through accurate probabilistic models as
well as behavioural Verilog-A system-level modelling [10]
coupled to post-layout transistor-level simulations, we present
the capability of the transceiver in [11] to operate at the
increased speed of 12 Gb/s (compared with 9.2 Gb/s for the previous
implementation) over a channel that complies with the specifications
expected of the yet-to-be-defined MIPI A-PHY standard and features an
attenuation of 33 dB at the Nyquist frequency [6].
The paper proceeds as follows: section II describes the
architecture of the transceiver, the upgrade from the past
realization [5], [11], [12] and the tests demonstrating operation
over the automotive PVT range; section III describes the
implementation of the fully-adaptive algorithm; transistor-level
simulations are shown in section IV for realistic automotive
channels, with considerations on the transceiver’s equalization
strategy; conclusions are drawn in section V.
II. TRANSCEIVER ARCHITECTURE
The block diagram of the transceiver is sketched in Fig. 1.
In this paper we focus on the adaptive equalization of the
channel. Details on the CDR algorithm are reported in [12].
The transmitter [5] operates at full rate and performs
feed-forward equalization (FFE) with 1 pre-, 1 main- and 6
post-cursors, all programmable in steps of 1/16 (≈ 1.16 dB of
variation). A low-dropout regulator (LDO) is used to set the
output voltage swing. Different replicas of the driver can be
connected in parallel to set the transmitter’s output impedance.
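The FFE described above can be sketched behaviourally as an FIR filter whose weights are multiples of 1/16. The snippet below is a minimal illustration, not the authors' implementation: the function names are hypothetical, only three of the eight taps are populated (using the pre-set quoted in the caption of Fig. 3), and the requirement that the absolute tap weights sum to at most 1 is an assumption reflecting the fixed swing of a voltage-mode driver.

```python
from fractions import Fraction

STEP = Fraction(1, 16)  # tap resolution stated in the text


def quantize_taps(taps):
    """Round each requested weight to the nearest multiple of 1/16."""
    return [round(Fraction(t) / STEP) * STEP for t in taps]


def ffe(bits, taps, n_pre=1):
    """Behavioural FIR: y[n] = sum_k w[k] * d[n - k], k = -n_pre, ..., n_post.

    `taps` is ordered from the first pre-cursor to the last post-cursor.
    Symbols outside the sequence are treated as zero (an assumption).
    """
    symbols = [1 if b else -1 for b in bits]
    out = []
    for n in range(len(symbols)):
        acc = Fraction(0)
        for j, w in enumerate(taps):
            k = j - n_pre              # cursor index, -1 is the pre-cursor
            idx = n - k
            if 0 <= idx < len(symbols):
                acc += w * symbols[idx]
        out.append(acc)
    return out


# Pre-set from the caption of Fig. 3: w[-1] = -4/16, w[0] = 9/16, w[1] = -3/16
taps = quantize_taps([-0.25, 0.5625, -0.1875])
# Assumed peak constraint for a voltage-mode driver
assert sum(abs(w) for w in taps) <= 1
steady = ffe([1, 1, 1, 1, 1], taps)[2]   # long run of ones: de-emphasised level
edge = ffe([0, 0, 1, 1], taps)[2]        # first bit after a transition: boosted level
```

With this pre-set a long run of identical bits is transmitted at a lower level than the first bit after a transition, which is the high-frequency emphasis the FFE is meant to provide.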
The receiver [12] operates at half rate with a 1-tap speculative
decision-feedback equalizer (DFE) followed by 2 further
non-speculative taps, all programmable in steps of 7 mV.
A programmable variable gain amplifier (VGA) at the input
adapts the voltage swing for the subsequent continuous-time
linear equalizer (CTLE). In addition to the architecture presented
in [12], Fig. 1 shows two error samplers with variable
threshold dLev (two more are in the odd path), which are used to perform
full adaptation as described in the following section.
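The idea of a speculative (loop-unrolled) first tap can be illustrated with a short behavioural model. This is a sketch under simplifying assumptions, not the circuit of Fig. 1: it runs at full rate rather than on even/odd half-rate paths, it is noiseless, the function names are hypothetical, and the tap values are illustrative multiples of the 7 mV step.

```python
def channel(bits, h0, posts):
    """Toy channel: main cursor h0 plus post-cursor ISI (hypothetical values)."""
    hist = [1] * len(posts)            # bits before the start are assumed to be +1
    out = []
    for d in bits:
        out.append(h0 * d + sum(h * p for h, p in zip(posts, hist)))
        hist = [d] + hist[:-1]
    return out


def speculative_dfe(samples, h1, h2=0.0, h3=0.0):
    """1-tap speculative DFE followed by two direct-feedback taps."""
    prev = [1, 1, 1]                   # d[n-1], d[n-2], d[n-3], assumed +1 at start
    out = []
    for y in samples:
        # Taps 2 and 3 are subtracted directly at the summing node.
        corrected = y - h2 * prev[1] - h3 * prev[2]
        # Tap 1 is unrolled: both outcomes are resolved in parallel...
        d_if_prev_plus = 1 if corrected - h1 > 0 else -1
        d_if_prev_minus = 1 if corrected + h1 > 0 else -1
        # ...and the previous decision only drives a multiplexer.
        d = d_if_prev_plus if prev[0] == 1 else d_if_prev_minus
        out.append(d)
        prev = [d, prev[0], prev[1]]
    return out


bits = [1, -1, -1, -1, 1, 1, -1, 1]
rx = channel(bits, h0=0.080, posts=[0.070, 0.014, 0.014])
plain = [1 if y > 0 else -1 for y in rx]          # slicer without DFE
equalized = speculative_dfe(rx, 0.070, 0.014, 0.014)
```

Because the first tap never has to be subtracted from the analog signal within one bit period, the critical feedback path reduces to a multiplexer selection, which is the timing relaxation that loop unrolling is meant to provide.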
The circuit was designed in 28 nm planar CMOS technology
and experimental data proving operation up to 9.2 Gb/s was
This is the author's version of an article that has been published in this journal. Changes were made to this version by the publisher prior to publication.
The final version of record is available at  http://dx.doi.org/10.1109/TCSII.2019.2926152
Copyright (c) 2019 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.






































Fig. 1. General architecture of the transceiver (lines are drawn as single-ended although signalling is differential). On the left, the transmitter is shown with
FFE; the transmission channel is followed by an analog front-end comprising a VGA and a CTLE stage; the subsequent even portion of the half-rate receiver
(the odd portion is omitted for simplicity) on the right implements a 3-tap speculative DFE; edge and error sampling is shown on the bottom right. The CDR
provides the edge clock, while the adaptive block (not shown) computes the DFE taps, dLev and the skew between the edge and data clocks.
reported in [11]. Parts of the circuit have been re-designed to
increase the speed up to 12 Gb/s: most notably, the receiver's
CTLE and VGAs are now capable of compensating 7.5 dB
and 6 dB, respectively, and one of the DFE taps has been
detached in order to implement offset compensation at the
samplers at start-up time; other components were optimized
for the higher speed. As a consequence, the transceiver’s
figure of merit reported in [11] has increased to 6.8 mW/Gb/s
w.r.t. 5.7 mW/Gb/s in Table I therein. Simulations (see Table I
here) demonstrate operation of the transmitter up to 12 Gb/s
over the whole automotive range of temperature (−40 °C to
150 °C), supply (−10 % to +10 %) and technology corners
(only slow-slow and fast-fast results are reported for brevity).
We simulated the whole transceiver in loopback mode and
found no errors in the received ∼ 2¹⁵ bits over the whole
automotive PVT range. Longer mixed-signal simulations on
XA-VCS also showed no errors.
III. ADAPTATION ALGORITHM
Implementation of fully-adaptive algorithms requires addi-
tional hardware with respect to the receiver described in [12].
We added the error samplers shown in Fig. 1 that compare the
equalized received analog data with the data level dLev [13].
Error and data samples are deserialized 1:40 and then input to
the LMS algorithm [9]. The LMS loop for dLev is updated
based on the correlation between data and error samples:






where d_i and e_i are the i-th received data and error samples,
respectively, ∆dLev = 60 mV is the update step and N = 80 is the bit-sequence length.
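A block update of this kind can be sketched as follows. The form used here is a common sign-sign LMS rule, assumed for illustration and not necessarily the authors' exact expression: the error sample is taken as the sign of the difference between the received sample and the data level, and dLev moves by one step in the direction of the sign of the data/error correlation accumulated over N bits. The signal amplitude, noise level and step size in the demo are hypothetical.

```python
import random


def sign(x):
    return (x > 0) - (x < 0)


def dlev_update(dlev, samples, decisions, step):
    """One block update of the data level (assumed sign-sign LMS form).

    e_i = sign(y_i - d_i * dLev) is the error sample and d_i = +/-1 the data
    sample; dLev moves by one step along sign(sum_i e_i * d_i).
    """
    corr = sum(sign(y - d * dlev) * d for y, d in zip(samples, decisions))
    return dlev + step * sign(corr)


def run(amplitude=0.080, step=0.002, n=80, blocks=200, seed=0):
    """Toy convergence run on an ideal two-level signal with Gaussian noise."""
    rng = random.Random(seed)
    dlev = 0.0
    for _ in range(blocks):
        d = [rng.choice((-1, 1)) for _ in range(n)]
        y = [di * amplitude + rng.gauss(0.0, 0.005) for di in d]
        dlev = dlev_update(dlev, y, d, step)
    return dlev


final_dlev = run()
```

In this toy run dLev climbs one step per block until it reaches the signal amplitude and then dithers around it, which is the qualitative behaviour expected from a sign-based loop.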










TABLE I
SIMULATED EYE HEIGHT (EH) AND WIDTH (EW) FOR AUTOMOTIVE
CORNERS AT 12 Gb/s. SUPPLY PARASITICS: R = 1 Ω AND L = 1 nH;
GROUND PARASITICS: R = 200 mΩ AND L = 200 pH.

Voltage [%]   Temperature [°C]   Technology   EH [mV]   EW [ps]
+10           150                FF           596       75
+10           150                SS           487       74
+10           −40                FF           626       76
+10           −40                SS           505       81
−10           150                FF           556       76
−10           150                SS           436       73
−10           −40                FF           549       75
−10           −40                SS           438       73

The optimal sampling point is found through another
LMS loop that modifies the data-sampling phase w.r.t. the
edge-sampling clock determined by the CDR [9]:






where φ is the shift w.r.t. the CDR edge clock, ∆φ = 27°.
The FFE and CTLE parameters are implemented as
pre-settings: adaptation of the former would require a
back-channel [13], while our implementation of the latter has
a rather coarse granularity that is not well suited to adaptation.
At this stage of system development, the error samplers and
the DAC producing dLev are implemented in a behavioural
Verilog-A block to test the performance of the algorithm.
IV. RESULTS
In this section we simulate the performance of the
transceiver and of the fully-adaptive algorithm on a
channel likely to comply with the future definition of the MIPI






























Fig. 2. (a) Insertion loss and (b) pulse responses of a channel targeted by MIPI A-PHY applications having 33 dB loss at 6 GHz. The solid line is obtained
with the full microstrip model, the dashed line is the response to its rational fit (4). All equalizers and amplifiers are turned off.
















Fig. 3. Pulse response obtained from the fitted MIPI A-PHY channel of
Fig. 2(a) with an FFE pre-set of w_{−1} = −4/16, w_0 = 9/16, w_1 = −3/16,
and with the CTLE and the two VGAs amplifying respectively 4 dB and
5.5 dB at Nyquist frequency. The solid line is the inverse Fourier transform of the transfer
function in Eq. (4) convolved with the transmitted pulse, while the dashed line
is plotted from transistor-level simulations.
A-PHY standard. Since S parameters of such a channel will
be available only at the end of 2019, we followed a
well-established procedure [10] and translated the projected
loss of such a channel into a microstrip structure losing roughly
33 dB at Nyquist frequency. The channel is modelled numerically
in the following way. Firstly, the characteristic impedance
and effective dielectric constant are related to the microstrip
geometry according to [14]. Secondly, RLCG parameters are
extracted and combined with the driver and receiver impedance
in order to compute the channel’s transfer function [15]. We
selected a geometry giving a single-ended impedance of 50 Ω
up to 12 GHz. We then tweaked the microstrip’s parameters
in order to obtain an attenuation of 33 dB at 6 GHz, which is
expected to be the typical loss MIPI A-PHY and IEEE 802.3
are currently targeting [7], [16], [17]. The resulting gain as a
function of frequency is reported in Fig. 2(a).
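The second step of this procedure, going from per-unit-length RLCG parameters and termination impedances to a transfer function, can be sketched with the standard ABCD-matrix description of a uniform transmission line. The snippet is an illustration only: the function name and all parameter values are hypothetical (chosen so that the lossless characteristic impedance is 50 Ω), the square-root skin-effect term and the constant loss tangent are simplifying assumptions, and the microstrip geometry relations of [14] are not reproduced.

```python
import cmath
import math


def line_gain_db(freq_hz, length_m, z_src=50.0, z_load=50.0,
                 l_per_m=350e-9, c_per_m=140e-12,
                 r_dc=5.0, r_skin=1e-3, tan_delta=0.02):
    """Gain in dB, relative to a direct source-load connection, of a uniform
    line described by per-unit-length RLCG parameters (hypothetical values).
    Valid for freq_hz > 0."""
    w = 2.0 * math.pi * freq_hz
    r = r_dc + r_skin * math.sqrt(freq_hz)   # assumed skin-effect model
    g = w * c_per_m * tan_delta              # assumed dielectric-loss model
    z_series = complex(r, w * l_per_m)       # R + jwL
    y_shunt = complex(g, w * c_per_m)        # G + jwC
    gamma = cmath.sqrt(z_series * y_shunt)   # propagation constant
    z0 = cmath.sqrt(z_series / y_shunt)      # characteristic impedance
    # ABCD parameters of a line of the given length
    a = cmath.cosh(gamma * length_m)
    b = z0 * cmath.sinh(gamma * length_m)
    c = cmath.sinh(gamma * length_m) / z0
    d = a
    # Voltage transfer from the source EMF to the load
    h = z_load / (a * z_load + b + c * z_src * z_load + d * z_src)
    # With equal terminations a direct connection gives h = 1/2, hence the factor 2
    return 20.0 * math.log10(abs(2.0 * h))


gain_short = line_gain_db(6e9, 0.5)
gain_long = line_gain_db(6e9, 1.0)
```

In a flow like the one described above, the line length or geometry would then be adjusted until the gain at 6 GHz matches the target attenuation.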
The channel is included in the circuit simulations as a









































Fig. 4. Schematic of the virtual eye monitor that makes it possible to visualize the effect
of the full DFE, despite the first tap being speculative. All multiplexers and
delays are behavioural blocks of analog circuits.
the red dashed line in Fig. 2(a). In Fig. 2(b), the resulting
pulse response is compared to the one produced by inverse
Fourier transformation of the complete transfer function convolved
with the transmitted pulse using the method presented in [18].
The channel features many post-cursors and two large
pre-cursors. The peak is significantly attenuated to approxi-
mately 50 mV, while the transmitted differential swing was
0.6 V. The corresponding simulated eye diagram (not shown)
is completely closed and corresponds to a very high BER.
In such a situation, DFE alone cannot compensate ISI since
its feedback register would be filled with wrong bit decisions. We
thus adjusted the system’s pre-set (VGAs, CTLE and FFE)
to improve the reception. These settings need not be optimal,
and may be determined in a calibration procedure prior to
the beginning of transmission. In fact, it will be shown later
on that the same pre-settings are also a suitable starting
point for different channels, demonstrating that they do
not need to be chosen precisely. The resulting pulse response
is reported in Fig. 3: The cursors are greatly reduced, although
the eye diagram is still closed, despite the action of the analog
front-end (amplifying ≈ 15 dB at Nyquist frequency).
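Why residual cursors can keep the eye closed, and how much a DFE can recover, is easy to see with a worst-case (peak-distortion) bound on the vertical eye opening. This is a simpler technique than the probabilistic method of [15] and is shown only as an illustration; the function name is hypothetical, the cursor values are illustrative rather than read from Fig. 3, and the DFE is assumed to cancel its post-cursors perfectly.

```python
def worst_case_eye_height(cursors, dfe_taps=0):
    """Peak-distortion bound: 2 * (main cursor - sum of |uncancelled cursors|).

    `cursors` maps the cursor index (0 = main, negative = pre-cursors) to the
    pulse-response sample in volts. Post-cursors 1..dfe_taps are assumed to be
    cancelled ideally. A negative result means the worst-case eye is closed.
    """
    main = cursors[0]
    isi = 0.0
    for k, h in cursors.items():
        if k == 0 or 1 <= k <= dfe_taps:
            continue
        isi += abs(h)
    return 2.0 * (main - isi)


# Illustrative pulse-response samples in volts (hypothetical values)
cursors = {-1: 0.020, 0: 0.080, 1: 0.070, 2: 0.014, 3: 0.014, 4: 0.005}
without_dfe = worst_case_eye_height(cursors)
with_dfe = worst_case_eye_height(cursors, dfe_taps=3)
```

With these numbers the bound is negative without a DFE and positive once three post-cursors are removed, leaving only the pre-cursor and the tail as residual ISI. The remaining pre-cursor is what a shift of the sampling phase can then reduce.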


























Fig. 5. Eye diagram corresponding to the pulse response of Fig. 3 (before
adaptive equalization). On the left, (a) was constructed from the pulse response
by using a newly-developed probabilistic algorithm [15]; on the right, (b) was

































Fig. 6. Equalization quantities adapted by the behavioural block as a function
of time as obtained from transistor-level simulation. The DFE taps and the
phase are computed as digital codes and were converted to physical quantities
for plotting purposes.
Obtaining the eye diagram from the circuit simulation is
not straightforward. The receiver is half rate and the DFE is
speculative: There are 4 analog nodes connected to 4 samplers
(Fig. 1). For plotting purposes, we added behavioural analog
multiplexers to combine these 4 signals and plot a single
eye diagram taking into account the speculative tap (Fig. 4).































Fig. 7. Eye diagrams after convergence of the LMS equalization loops on
the channel of Fig. 2(a) as obtained through (a) post-processing of the pulse
response with the method in [15] and (b) transistor-level simulations.
For the MIPI A-PHY channel with the pre-set of Fig. 3 we
obtained the eye in Fig. 5(b), which is as closed as the one in Fig. 5(a)
(obtained from post-processing of the pulse response [15]).
DFE and optimal sampling point adaptation are needed in
order to improve the situation. The convergence of the LMS
loops based on Eqs. (1), (2) and (3) is reported in Fig. 6. dLev
converges in the proximity of 80 mV, roughly corresponding
to the peak in Fig. 3. The optimal sampling point moves
to about 70° away from the edge clock phase (compared
to the 90° commonly set by the CDR to have the sampling
phase exactly between the transition edges). The 3 DFE taps
converge to 70 mV, 14 mV and 14 mV respectively, similarly
to the post-cursors in Fig. 3 after the sampling point is moved
towards the left (to reduce the first pre-cursor).
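To relate these phase values to time, note that 90° corresponds to the middle of the bit period, which implies that 360° spans two unit intervals, i.e. one period of the half-rate clock. Under that reading (an inference from the text, with a hypothetical helper name), the conversion is:

```python
def phase_to_ps(phase_deg, bit_rate=12e9):
    """Convert a clock phase in degrees to a delay in picoseconds.

    Assumes 360 degrees equals one half-rate clock period, i.e. 2 UI, so that
    90 degrees lands halfway between two consecutive data transitions.
    """
    ui_ps = 1e12 / bit_rate            # unit interval: about 83.3 ps at 12 Gb/s
    clock_period_ps = 2.0 * ui_ps
    return phase_deg / 360.0 * clock_period_ps


nominal_ps = phase_to_ps(90.0)         # conventional mid-eye sampling point
adapted_ps = phase_to_ps(70.0)         # sampling point found by the LMS loop
advance_ps = nominal_ps - adapted_ps   # how much earlier the data is sampled
```

Under this assumption the adapted sampling point sits a little over 9 ps earlier than the conventional mid-eye position.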
The eye diagrams after full adaptation, i.e. obtained by
post-processing the pulse response of Fig. 3 after DFE and
sampling point adaptation and from transistor-level simulations,
are reported in Fig. 7: we now observe an open eye,
which confirms that the adaptive loops for the DFE
and the optimal sampling point operate correctly. This is also supported by the
absence of bit errors in the circuit simulation over a sequence
of 10⁴ bits. We also see that the circuit simulation is in
good agreement with the eye obtained via numerical processing
of the pulse response [15]. Although the time-domain
transistor-level simulations cannot prove operation with
BER < 10⁻¹² (as required by MIPI A-PHY and IEEE 802.3),
the post-processing of the pulse response suggests that this












Fig. 8. Eye diagram after convergence of the LMS equalization loops on a
channel with 20 dB loss at 6 GHz as obtained from transistor-level simulation.
is the case, provided that the jitter of the clock and CDR is
sufficiently low. To have a rough estimate of the effect of jitter,
we exploit the method in [19] applied to the eye obtained from
post-processing of the pulse response. The jitter is estimated to
be 1.5 ps RMS from the bathtub plots measured at 4.6 Gb/s in
[11] and scaled to 12 Gb/s operation using a simple theoretical
model; the result is a reduction in eye width from ≈ 38 ps to
≈ 19 ps at BER = 10⁻¹².
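As a rough cross-check of the order of magnitude, the impact of purely Gaussian random jitter on the eye width can be estimated by subtracting 2·Q·σ from the jitter-free width, where Q is the Gaussian quantile for the target BER. This is a textbook approximation, not the method of [19], and the function names are hypothetical.

```python
import math


def q_for_ber(ber):
    """Solve 0.5 * erfc(Q / sqrt(2)) = ber for Q by bisection."""
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 0.5 * math.erfc(mid / math.sqrt(2.0)) > ber:
            lo = mid          # tail probability still too large: increase Q
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eye_width_with_rj(width_ps, sigma_ps, ber=1e-12):
    """Eye width left after Gaussian jitter closes both eye edges by Q * sigma."""
    return width_ps - 2.0 * q_for_ber(ber) * sigma_ps


q = q_for_ber(1e-12)
width_ps = eye_width_with_rj(38.0, 1.5)
```

With the 38 ps width and 1.5 ps RMS jitter quoted above, this approximation gives roughly 17 ps, in the same range as the ≈ 19 ps obtained with the method of [19].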
We now show that the FFE, CTLE and VGA pre-sets chosen
above can support fully-adaptive operation on different
channels. We chose a channel with the same geometry and
physical characteristics as the one in Fig. 2(a), now attenuating
20 dB at Nyquist frequency. The equalized eye diagram is
shown in Fig. 8, demonstrating equalization capabilities over
different channels. Unlike in Fig. 7, the optimal
sampling point is now delayed with respect to the center of the eye,
because the chosen FFE pre-set results in a negative pre-cursor,
which is compensated by moving the sampling point to the
right (in the previous case it was moved to the left).
V. CONCLUSIONS
We have devised an accurate modelling procedure based on
probabilistic system-level calculations and behavioural models
coupled to transistor-level simulations to show the capability
of an HSSI operating at 12 Gb/s to properly equalize channels
with a wide range of attenuation from 20 dB to 33 dB at 6 GHz
and to operate over the whole automotive PVT range. We
have proven that a single pre-set for FFE and CTLE can work
with different channels: If the attenuation is low, the pre-set
over-compensates the pre- and post-cursor, but adaptive DFE
and optimal sampling point counteract this effect.
Through efficient probabilistic post-processing of the pulse
response [15] we have also shown how to estimate eye dia-
grams which closely resemble the time-domain transistor-level
simulations, which is useful to estimate the BER.
Overall, such a coupled circuit/behavioural modelling ap-
proach proved to be a powerful tool to determine the perfor-
mance of high-speed serial interfaces and provide guidelines
about equalization and LMS parameters.
ACKNOWLEDGEMENTS
We acknowledge M. Bassi and G. Steffan for helpful
discussions, and A. De Prà for help in the development of
the fully-adaptive algorithm.
REFERENCES
[1] T. C. Carusone, “Introduction to Digital I/O: Constraining I/O Power
Consumption in High-Performance Systems,” IEEE Solid-State Circuits
Mag., vol. 7, no. 4, pp. 14–22, Fall 2015.
[2] J. Lee, K. Park, K. Lee, and D.-K. Jeong, “A 2.44-pJ/b 1.62–10-Gb/s
Receiver for Next Generation Video Interface Equalizing 23-dB Loss
With Adaptive 2-Tap Data DFE and 1-Tap Edge DFE,” IEEE Trans.
Circuits Syst. II, vol. 65, no. 10, pp. 1295–1299, Oct. 2018.
[3] J. Park, J.-H. Chae, Y.-U. Jeong, J.-W. Lee, and S. Kim, “A 2.1-Gb/s
12-Channel Transmitter With Phase Emphasis Embedded Serializer for
55-in UHD Intra-Panel Interface,” IEEE J. Solid-State Circuits, vol. 53,
no. 10, pp. 2878–2888, Oct. 2018.
[4] N. J. Endo, “Wireless Communication In and Around the Car: Status
and Outlook. ES3: High-Speed Communications on 4 Wheels: What’s
in Your next Car?” in 2013 IEEE Int. Solid-State Circuits Conf. Dig. of
Tech. Papers, Feb. 2013, pp. 515–515.
[5] A. Bandiziol, W. Grollitsch, F. Brandonisio, R. Nonis, and P. Palestri,
“Design of a 8-taps, 10gbps transmitter for automotive micro-
controllers,” in 2016 IEEE Asia Pacific Conf. on Circuits and Syst.
(APCCAS), Oct 2016, pp. 321–324.
[6] “IEEE P802.3ch Multi-Gig Automotive Ethernet PHY Task Force,”
http://www.ieee802.org/3/ch/public/mar18/index.html, Mar. 2018.
[7] E. Di Biaso, B. Bergner, and C. Mandel, “High Speed Channel Modeling
and Analysis,” Apr. 2018. [Online]. Available:
http://www.ieee802.org/3/ch/public/adhoc/Bergner_DiBiaso_Mandel_3ch_01_0418.pdf
[8] J. F. Bulzacchelli, “Equalization for Electrical Links: Current Design
Techniques and Future Directions,” IEEE Solid-State Circuits Mag.,
vol. 7, no. 4, pp. 23–31, Fall 2015.
[9] V. Balan, O. Oluwole, G. Kodani, C. Zhong, R. Dadi, A. Amin,
A. Ragab, and M.-J. E. Lee, “A 15–22 Gbps Serial Link in 28 nm
CMOS With Direct DFE,” IEEE J. Solid-State Circuits, vol. 49, no. 12,
pp. 3104–3115, 2014.
[10] D. Menin, A. De Prà, A. Bandiziol, W. Grollitsch, R. Nonis, and
P. Palestri, “A Simple Simulation Approach for the Estimation of
Convergence and Performance of Fully-Adaptive Equalization in High-
Speed Serial Interfaces,” IEEE Trans. Compon. Packag. Manuf. Technol.,
2019, available online.
[11] A. Bandiziol, W. Grollitsch, G. Steffan, R. Nonis, and P. Palestri,
“Design and Characterization of a 9.2-Gb/s Transceiver for Automotive
Microcontroller Applications With 8-Taps FFE and 1-Tap Unrolled/4-
Taps DFE,” IEEE Trans. Circuits Syst. II, vol. 65, no. 10, pp. 1305–1309,
Oct. 2018.
[12] A. Bandiziol, W. Grollitsch, F. Brandonisio, M. Bassi, R. Nonis, and
P. Palestri, “Design of a Half-Rate Receiver for a 10Gbps Automotive
Serial Interface with 1-Tap-Unrolled 4-Taps DFE and Custom CDR
Algorithm,” in 2018 IEEE Int. Symp. on Circuits and Syst. (ISCAS),
May 2018, pp. 1–5.
[13] V. Stojanović, A. Ho, B. Garlepp, F. Chen, J. Wei, G. Tsang, E. Alon,
R. Kollipara, C. Werner, J. Zerbe, and M. Horowitz, “Autonomous dual-
mode (PAM2/4) serial link transceiver with adaptive equalization and
data recovery,” IEEE J. Solid-State Circuits, vol. 40, no. 4, pp. 1012–
1026, Apr. 2005.
[14] M. Dazzi, P. Palestri, D. Rossi, A. Bandiziol, I. Loi, D. Bellasi, and
L. Benini, “Sub-mW multi-Gbps chip-to-chip communication Links for
Ultra-Low Power IoT end-nodes,” in 2018 IEEE Int. Symp. on Circuits
and Syst. (ISCAS), May 2018, pp. 1–5.
[15] A. Cortiula, M. Dazzi, M. Marcon, D. Menin, A. Bandiziol, A. Cristo-
foli, W. Grollitsch, R. Nonis, and P. Palestri, “A Simple and Fast Tool
for the Modelling of Inter-Symbol Interference and Equalization in
High-Speed Chip-to-Chip Interfaces,” in The 42nd Int. Conv. on Infor-
mation and Communication Technol., Electronics and Microelectronics
(MIPRO), May 2019.
[16] T. Müller, “802.3ch channel measurement results,” Oct.
2017. [Online]. Available:
http://www.ieee802.org/3/ch/public/adhoc/2017-10-04%20802.3ch%20channel%20measurementresults%20.pdf
[17] E. Di Biaso, “Insertion Loss Limit Analysis,” May 2018. [Online].
Available:
http://www.ieee802.org/3/ch/public/adhoc/DiBiaso_3ch_01_05-30-18%20-%20adhoc.pdf
[18] T. Brazil, “Causal-convolution – a new method for the transient analysis
of linear systems at microwave frequencies,” IEEE Trans. Microw.
Theory Techn., vol. 43, no. 2, pp. 315–323, Feb. 1995.
[19] A. Sanders, “Statistical Simulation of Physical Transmission Media,”
IEEE Trans. Adv. Packag., vol. 32, no. 2, pp. 260–267, May 2009.
