Abstract-We present a modeling framework for high-speed coupled channels that allows the simulation of millions of bits in a few seconds. The modeling approach extends the standard IBIS-AMI by including common-mode signals. Further, an expansion of the transient responses at both driver and receiver ports into hierarchical basis functions makes it easy to represent long-term memory effects due to the possibly slow dynamics of pre-emphasis blocks. Numerical experiments demonstrate the high accuracy and efficiency of the proposed technique.
I. INTRODUCTION
The objective of this work is to present a novel high-speed serial link simulation method that evaluates the transient waveforms at the receiver input due to switching signals comprising millions of bits, in just a few seconds. This task is nowadays considered a commodity in modern commercial channel simulation tools adopting the IBIS-AMI standard. This work aims to overcome some of the limitations of the latter approach.
First, we treat the general case of multiport transceivers connected by coupled channels, allowing seamless modeling and simulation of both differential and common-mode waveforms. Second, we propose a novel expansion of the transient responses in terms of a hierarchical set of multilevel transient basis functions. This expansion is motivated by the special form that the pre-emphasis circuitry embedded in the driver induces in the transient responses, which may present long-term memory effects due to the slow dynamics of the corresponding digital filter blocks. These effects may change the shape of the elementary switching waveforms, depending on how many successive switching events precede each transmitted bit. This phenomenon may be hard to model within the IBIS-AMI framework, which is based on weighted precursors and postcursors that are just rescaled versions of the same elementary pulse. The proposed signal representation overcomes this limitation through successive hierarchical refinements that converge to the reference responses.
The proposed formulation is based on linearity assumptions, so that superposition holds. This enables frequency-domain (FFT) or time-domain (recursive convolution) approaches to compute transient responses. We adopt the former frequency-domain approach, which naturally plugs into the existing IBIS-AMI framework by suitably extending its scope and applicability.
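As a minimal sketch of the frequency-domain route, assuming a single real-valued pulse response sampled on a uniform grid, superposition over a bit sequence can be computed with zero-padded FFTs. The function name `superpose_fft` and its arguments are illustrative and are not part of IBIS-AMI or of the proposed formulation.

```python
import numpy as np

def superpose_fft(symbols, pulse, samples_per_ui):
    """Linear superposition of one pulse response per bit, computed with FFTs.

    symbols        : per-bit weights (e.g. +1/-1), one per unit interval (assumed input)
    pulse          : sampled response to a single symbol (assumed input)
    samples_per_ui : number of samples per unit interval
    """
    symbols = np.asarray(symbols, dtype=float)
    pulse = np.asarray(pulse, dtype=float)
    # Impulse train: one weighted impulse at the start of each unit interval.
    train = np.zeros(symbols.size * samples_per_ui)
    train[::samples_per_ui] = symbols
    # Zero-pad to the full linear-convolution length to avoid circular wrap-around.
    n_out = train.size + pulse.size - 1
    spectrum = np.fft.rfft(train, n_out) * np.fft.rfft(pulse, n_out)
    return np.fft.irfft(spectrum, n_out)

# Usage with synthetic data: 64 random bits, 8 samples per UI, decaying pulse.
rng = np.random.default_rng(0)
symbols = 2.0 * rng.integers(0, 2, 64) - 1.0
pulse = np.exp(-np.arange(40) / 8.0)
wave = superpose_fft(symbols, pulse, samples_per_ui=8)
```

The result coincides with a direct time-domain convolution of the impulse train with the pulse; the FFT only changes the cost, which is what makes very long bit sequences tractable.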
II. MODELING MULTIPORT TRANSCEIVERS
For the sake of illustration and without loss of generality, the following discussion is entirely based on differential drivers, since receivers can be seen as a simpler particular case. With reference to Fig. 1, we propose the following model structure Fig. 1 ) as a time-varying analog waveform. The above structure can be seen as a particular case of both state-of-the-art two-piece modeling formats known as IBIS and Mpilog [1], [4]. In fact, the latter Mpilog structure reads
where the two submodels, one for each fixed logic state, are nonlinear dynamical multivariate relations accounting for the static and the dynamic behavior of the driver, and their weights are time-varying switching functions. A similar structure holds for IBIS [1]. The proposed model structure (1) is easily derived from (2) under the following assumptions:
• symmetrical switching weights, i.e., the two switching functions sum to one at all times;
• submodels approximated as LTI blocks including both a linear multivariate static characteristic (defined by conductance matrices and static bias current vectors) and a dynamic LTI submodel which approximates the device behavior in each fixed logic state;
• the driver is symmetric in both its static part and its dynamic part.
The various parameters of the model are estimated from a set of Transistor-Level (TL) simulations [4]. First, the static multivariate characteristics are extracted by a set of double DC sweeps, and the best linear approximations are obtained by a least-squares fit, yielding the conductance matrices and bias current vectors. Second, the dynamic submodels are obtained through Time-Domain Vector Fitting [5] applied to port waveforms obtained through TL transient simulations. Finally, the switching sources are extracted from the computed current and voltage responses of the driver switching on a given load. The switching sources play in our framework the same role as the weighting functions in standard IBIS/Mpilog models, providing in particular an effective behavioral representation of the driver pre-emphasis blocks.
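The least-squares step for the static part can be sketched as follows, assuming the linear characteristic is written as i ≈ G v + i0 with port voltages v, port currents i, a conductance matrix G and a bias current vector i0. The symbol names, the sign convention and the function `fit_static_linear` are assumptions made for this illustration; the data below are synthetic and stand in for the double DC sweep.

```python
import numpy as np

def fit_static_linear(v_samples, i_samples):
    """Least-squares fit of i ≈ G v + i0 from DC sweep samples.

    v_samples, i_samples : arrays of shape (n_points, n_ports)
    Returns (G, i0) with G of shape (n_ports, n_ports) and i0 of shape (n_ports,).
    """
    v = np.asarray(v_samples, dtype=float)
    i = np.asarray(i_samples, dtype=float)
    # Append a column of ones so that the bias current is fitted jointly with G.
    a = np.hstack([v, np.ones((v.shape[0], 1))])
    coeffs, *_ = np.linalg.lstsq(a, i, rcond=None)
    g = coeffs[:-1].T   # row j of G maps the port voltages to current j
    i0 = coeffs[-1]     # static bias current vector
    return g, i0

# Synthetic "double DC sweep" on a two-port device (hypothetical values).
v1, v2 = np.meshgrid(np.linspace(0.0, 1.0, 11), np.linspace(0.0, 1.0, 11))
v = np.column_stack([v1.ravel(), v2.ravel()])
g_true = np.array([[0.02, -0.005], [-0.004, 0.03]])
i0_true = np.array([1e-3, -2e-3])
i = v @ g_true.T + i0_true
g_fit, i0_fit = fit_static_linear(v, i)
```

With real transistor-level data the fit would only be approximate, and the residual gives a direct measure of how well the linear static assumption holds over the swept range.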
III. MODELING LONG-TERM MEMORY EFFECTS OF PRE-EMPHASIS BLOCKS
The main motivation for the proposed approach is best appreciated from Fig. 2, which shows the switching patterns of a commercial 40-nm low-power driver (available as a transistor-level encrypted netlist), with and without its (1-tap) pre-emphasis activated. As expected, pre-emphasis boosts switching events. The maximum peak-to-peak amplitude of the waveforms is reached only after a few successive switchings, with a slow dynamic saturation effect. This effect makes each individual switching front different from the others, depending on the number of preceding consecutive switching events. Therefore, standard translation-invariant approaches that construct transient waveforms and eye diagrams through superposition of the same elementary pulse centered at multiple UIs do not seem adequate, since the shape of the switching fronts is not translation-invariant. Our solution to accurately represent bit-pattern-dependent switching fronts expands the driver source terms, taken relative to their initial values at t = 0, as the following sparse hierarchical superposition
where the expansion applies to each of the two components of the source vector in (1), switching events are spaced by multiples of the bit time (UI), '01' and '10' transitions are treated as two distinct switching types, and the number of hierarchical levels is bounded (usually 3-4 levels are sufficient). At each level ℓ, the index sets Ω^(ℓ) locate the switching events of each type that are immediately followed by at least ℓ consecutive switchings. The corresponding basis functions at each level ℓ characterize the incremental correction that must be applied to the waveform accounting for all lower levels up to ℓ − 1 in order to account for the difference in switching behavior due to the presence of the additional consecutive ℓ-th switching. The amplitude of such basis functions decreases with ℓ, as depicted in Fig. 3.
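An expansion with the properties described above can be sketched in the following form. All symbol names are assumptions chosen for this illustration: $\tilde{s}_k$ denotes the k-th source component relative to its initial value, $T$ the bit time, $\nu \in \{u, d\}$ the '01' and '10' transition types, $L$ the highest level retained, and $\varphi^{(\ell)}_{k,\nu}$ the level-$\ell$ basis functions; whether the level index runs up to $L$ or $L-1$ is likewise an assumption.

```latex
\tilde{s}_k(t) \;=\; \sum_{\ell=0}^{L} \; \sum_{\nu \in \{u,d\}} \;
  \sum_{n \in \Omega^{(\ell)}_{\nu}} \varphi^{(\ell)}_{k,\nu}\,(t - nT),
\qquad k = 1, 2 .
```

The sparsity comes from the innermost sum: only the bit positions collected in each index set contribute, and these sets shrink as the level increases.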
For any realistic switching pattern, the size of the index sets Ω^(ℓ) decreases when increasing ℓ, since the probability of occurrence of ℓ consecutive switching events decreases. Note that these index sets are determined by a digital preprocessing of the logic bit sequence and they are exact. The basis functions , ( ). The procedure is then iterated until all required levels have been extracted.
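The digital preprocessing that builds the index sets can be sketched as follows. The sketch takes the definition literally: a switching event at bit n (bit n differs from bit n − 1) belongs to level ℓ if it is immediately followed by at least ℓ further consecutive switchings. The function name, the dictionary layout and the keys "01" and "10" are hypothetical choices for this illustration.

```python
import numpy as np

def hierarchical_index_sets(bits, max_level):
    """Index sets of switching events per level and per transition type.

    Returns {level: {"01": positions, "10": positions}}, where each position n
    is a bit index such that bits[n] != bits[n - 1].
    """
    bits = np.asarray(bits, dtype=int)
    steps = np.diff(bits)            # steps[m] compares bit m + 1 with bit m
    toggles = steps != 0
    rising = steps > 0
    # run[m]: number of consecutive toggles starting at step m (0 if no toggle).
    run = np.zeros(toggles.size + 1, dtype=int)
    for m in range(toggles.size - 1, -1, -1):
        run[m] = run[m + 1] + 1 if toggles[m] else 0
    run = run[:-1]
    sets = {}
    for level in range(max_level + 1):
        # The event itself plus `level` following switchings: run of level + 1.
        qualifies = run >= level + 1
        sets[level] = {
            "01": np.flatnonzero(qualifies & rising) + 1,
            "10": np.flatnonzero(qualifies & ~rising) + 1,
        }
    return sets

# Usage on a short pattern.
index_sets = hierarchical_index_sets([0, 1, 0, 1, 1, 0, 0], max_level=2)
```

Level 0 collects every switching event, and each higher level is a subset of the previous one, which is the source of the decreasing set sizes noted above.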
IV. COMPLETE CHANNEL SIMULATION AND RESULTS
Consider now the simulation of a coupled channel terminated by differential (multiport) driver and receiver modeled , ( ), which in turn are used to express the received voltages as
This expression is identical to (4) but uses different (known) basis functions. In particular, it inherits sparsity due to the hierarchical multilevel expansion. Figure 4 shows the received voltages at the end of a coupled lossy channel driven by the driver of Fig. 2 with pre-emphasis enabled, obtained by successive superposition of increasing levels. It is noted that when only level ℓ = 0 is included, only isolated switching events are correctly represented. Including also level ℓ = 1 provides an accurate representation of any pair of bits that are consecutively switching. Adding more levels leads to convergence for all possible switching sequences.
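The level-by-level superposition can be illustrated with the following sketch, which assumes that the index sets and the sampled basis waveforms are already available as nested dictionaries keyed by level and transition type. This layout, the function name `synthesize_waveform` and the numerical values are hypothetical. Each (level, type) pair contributes one impulse train multiplied in the frequency domain by the spectrum of its basis waveform.

```python
import numpy as np

def synthesize_waveform(index_sets, basis, n_bits, samples_per_ui):
    """Sum shifted copies of each basis waveform at the bit positions of its index set."""
    longest = max(len(w) for per_type in basis.values() for w in per_type.values())
    n_out = n_bits * samples_per_ui + longest - 1   # full linear-convolution length
    total = np.zeros(n_out // 2 + 1, dtype=complex)
    for level, per_type in basis.items():
        for kind, waveform in per_type.items():
            positions = np.asarray(index_sets[level][kind], dtype=int)
            if positions.size == 0:
                continue                            # empty index set: nothing to add
            train = np.zeros(n_bits * samples_per_ui)
            train[positions * samples_per_ui] = 1.0
            total += np.fft.rfft(train, n_out) * np.fft.rfft(waveform, n_out)
    return np.fft.irfft(total, n_out)

# Usage with made-up index sets and basis waveforms (two levels, 2 samples per UI).
index_sets = {0: {"01": [1, 3], "10": [2, 5]}, 1: {"01": [1], "10": [2]}}
basis = {
    0: {"01": np.array([1.0, 0.5, 0.25]), "10": np.array([-1.0, -0.5, -0.25])},
    1: {"01": np.array([0.1, 0.05]), "10": np.array([-0.1, -0.05])},
}
y = synthesize_waveform(index_sets, basis, n_bits=7, samples_per_ui=2)
```

Dropping the entries of the higher levels from `basis` reproduces the truncated approximations, so the convergence with increasing level can be inspected by calling the function with progressively larger dictionaries.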
The received waveforms (5) are readily converted into eye diagrams. Thanks to linearity, common approaches for inclusion of deterministic and random jitter, as well as crosstalk, can be used in a post-processing phase, as in standard IBIS-AMI flows. An example (including jitter and crosstalk) is reported in Fig. 5. This example was obtained by processing a PRBS-31 pattern of one million bits. Our prototype, non-optimized MATLAB implementation returned this result in 29 seconds using a standard laptop (Intel Core i7-7500U CPU @ 2.70 GHz, 16.0 GB RAM).
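The conversion of a sampled waveform into an eye diagram can be sketched as a folding operation followed by a two-dimensional hit count. The sketch below assumes a uniformly sampled waveform with an integer number of samples per UI and does not include jitter or crosstalk; the function names and default parameters are illustrative.

```python
import numpy as np

def fold_into_eye(waveform, samples_per_ui, ui_per_trace=2):
    """Cut a waveform into overlapping traces, each spanning `ui_per_trace` unit intervals."""
    waveform = np.asarray(waveform, dtype=float)
    span = ui_per_trace * samples_per_ui
    n_traces = (waveform.size - span) // samples_per_ui + 1
    starts = np.arange(n_traces) * samples_per_ui    # one trace starts at every UI
    return waveform[starts[:, None] + np.arange(span)[None, :]]

def eye_histogram(traces, n_levels=64):
    """Hit counts on a (time within eye) x (voltage) grid; returns (counts, voltage edges)."""
    traces = np.asarray(traces, dtype=float)
    n_traces, span = traces.shape
    t = np.tile(np.arange(span), n_traces)           # sample index within each trace
    t_edges = np.arange(span + 1) - 0.5              # one time bin per sample
    v_edges = np.linspace(traces.min(), traces.max(), n_levels + 1)
    counts, _, _ = np.histogram2d(t, traces.ravel(), bins=[t_edges, v_edges])
    return counts, v_edges

# Usage on a synthetic ramp, 4 samples per UI.
traces = fold_into_eye(np.arange(20.0), samples_per_ui=4)
counts, v_edges = eye_histogram(traces, n_levels=16)
```

For very long patterns the folded array can be large; accumulating the histogram over blocks of the waveform is a straightforward variation that keeps memory bounded.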
The improvements that our approach may provide with respect to IBIS-AMI standard models are documented in Fig. 6 , which compares the received voltage computed using the true TL model to the proposed solution and to the solution obtained by a standard IBIS-AMI model (for which there is no common-mode voltage prediction).
