Modeling and Integration of Highly Parallel Optical Interconnect in Electronic Systems by De Wilde, M
Modellering en integratie van hoogparallelle optische verbindingen
in elektronische systemen
Modeling and Integration of Highly Parallel Optical Interconnect
in Electronic Systems
Michiel De Wilde
Advisor: prof. dr. ir. J. Van Campenhout
Dissertation submitted to obtain the degree of
Doctor of Engineering: Electrical Engineering
Department of Electronics and Information Systems
Chair: prof. dr. ir. J. Van Campenhout
Faculty of Engineering
Academic year 2006-2007
ISBN 978-90-8578-149-3
NUR 959
Legal deposit: D/2007/10.500/23
Dankwoord
Acknowledgements
Quite a number of people have contributed in one way or another to the creation of
this dissertation. I wish to thank them cordially for their support or participation.
Specific thanks go out to
• my advisor prof. Jan Van Campenhout, who gave me the room both to carry
out research and to acquire new skills. He was always willing to discuss
ideas and to proofread texts, and his full attention always extended to the
smallest details. On matters of noise phenomena, statistics and the correct
scientific use of symbols and language, I even had to make extra haste, or
he would solve the problems in my place.
• my parents, for their support and the many opportunities for personal
development they have given me from an early age, such as the chance to
pursue the university studies that made this work possible.
• my wife Vanessa and our children Simon (2 years 10 months) and Jonas
(1 year 5 months), for the cozy home and the necessary distraction. Our
little sons are still eager to type along all kinds of text whenever the
computer at home is switched on.
• ir. Olivier Rits, fellow PhD student at the Photonics research group of
the INTEC department. We have always worked together very well on the
research topics that were relevant to both my doctoral research and his
(on an accurate 3D alignment technique). His input was crucial for a
number of contributions in this work.
• my office mates dr. ir. Joni Dambre, ir. Wim Meeus, ir. Harald Devos,
ing. Jan De Ceuster and ir. Wim Heirman, for the pleasant working atmosphere,
the cheerful notes and their permanent willingness to brainstorm about
anything at all. With Wim Meeus, I will keep remembering the very smooth
collaboration in designing, testing/measuring and debugging numerous
electronic systems.
• all people involved in the Interconnect by Optics project who have contributed
to the development of the optical interconnect hardware prototypes that have
been heavily relied on in this work. Special thanks go out to dr. ir. Ronny
Bockstaele, who has managed the project with a lot of dedication.
• prof. Dirk Stroobandt, prof. Koen De Bosschere and prof. Erik D’Hollander,
who, alongside my advisor, commit themselves to steering the PARIS research
group and its PhD students in the right direction.
• the staff of the Photonics research group of the INTEC department, who were
always willing to provide measurement equipment and support. In particular,
I wish to mention Hendrik Sergeant and prof. Roel Baets, besides ir. Olivier
Rits.
• ir. Michael Vervaeke and prof. Hugo Thienpont (Vrije Universiteit Brussel,
TONA-TW department), for kindly fabricating an optical connector by means of
Deep Lithography with Protons.
• prof. Dirk Stroobandt (ELIS-PARIS) and the FWO project colleagues from the
ESAT/COSIC and ESAT/TELEMIC divisions of the Katholieke Universiteit Leuven,
for granting me the time that allowed me to complete this work.
• my brother ir. Wim De Wilde, for being willing to follow all the ups and
downs and for understanding all of it too.
• ir. Wim Heirman and ir. Frederik Vandeputte, for providing fine photographs
for a number of illustrations.
• prof. Pieter Rombouts and dr. ir. Christof Debaes (Vrije Universiteit
Brussel, TONA-TW department), for brainstorming along on analog chip design.
• prof. Soha Hassoun (Department of Computer Science, School of Engineering,
Tufts University) and all other people involved in the organization of the first
ACM-SIGDA Design Automation Summer School (2001, Cape Cod, MA,
USA).
• the examination commission for their time and constructive comments.
This work was supported by the Research Foundation – Flanders, the
Interconnect by Optics project (IST-2000-28358) of the European Community’s fifth
framework program for research and technological development, and the
Belgian Photon Network Interuniversity Attraction Poles program (IAP V/18),
initiated by the Belgian State, Prime Minister’s Services, Science Policy Office.
Samenvatting
Optical interconnections inside computer systems? Digital communication
by means of guided electromagnetic waves is carried mainly by electrical and
optical interconnections. The choice between the two is determined by the
length, the throughput and the application of the interconnection.
From the late 1970s onward, optical interconnections were commercially
introduced for telecommunication links spanning kilometers, because of their
excellent bandwidth and power properties: unlike in electrical interconnections,
losses are essentially independent of the throughput; they are also much
smaller. In the meantime, the general use of optical interconnections has come
to cover ever shorter distances (nowadays down to a few meters). This trend is
driven by the rising throughput requirements of telecommunication applications
and the availability of ever smaller affordable optoelectronic components.
Computer systems usually contain only electrical interconnections, because
these form the most natural coupling for electronic building blocks. The
latency of interconnections is often a performance-limiting factor in such
systems. The speed of light favors a compact construction; the time taken just
to transmit a data packet, in turn, calls for a large throughput.
A compact computer system bounds the physical cross-section available to each
interconnection. For a given length and power budget, the electrical resistance
of even the best conductors limits the electrically attainable bandwidth per
unit of cross-section.
In computer chips in complementary metal-oxide-semiconductor (CMOS) technology,
the attainable transistor density keeps increasing. This allows a greater
density and speed of computation. The throughput and latency of the electrical
interconnections, however, can barely keep up with these developments.
Distributed algorithms with strongly connected nodes are currently subject to
an interconnect-related performance limit.
Parallel short-range optical interconnections offer a possible way out where
the limitation of electrical throughput by the interconnect density poses a
problem. After all, on the scale of a computer system, the throughput of even
narrow optical interconnections is limited by the bandwidth of active
components rather than by losses and dispersion of signals in the optical path.
At present, the break-even distance above which optical interconnections are
more efficient than electrical ones is estimated at a few centimeters, as far
as the throughput limit per unit of cross-section is concerned.
Optical ports at the chip surface In this work, the approach of
optoelectronic very large scale integration (OE-VLSI) is considered. It enriches
a chip surface with optical ports in order to circumvent electrical density and
throughput problems in the immediate surroundings of the chip. The OE-VLSI
method can deliver throughputs of several hundreds of gigabits per second
through an interconnection several meters long with a cross-section of only a
few square millimeters.
The OE-VLSI system that is central here was developed in the Interconnect by
Optics (IO) project of the fifth framework programme of the European Commission
(EC). The translation between electronic and optical signals is handled by
semiconductor dies carrying a two-dimensional arrangement of vertical-cavity
surface-emitting lasers (VCSELs) or photodiodes of the p-i-n type. Through
flip-chip mounting, these dies are attached to the CMOS chips to be
interconnected, which are equipped for that purpose with flip-chip pads and
special interface circuits. The optical path is provided by a multi-fiber
optical fiber bundle through a window in the chip package. In the IO project,
dies with an arrangement of 8 by 8 optoelectronic devices at a mutual distance
of 250 µm were used. A data rate of 2.5 gigabits per second per channel proved
possible.
This thesis Circumventing the throughput problems near electrical chip ports
through an OE-VLSI approach does not alter the fact that the purely electrical
approach has, relatively speaking, been studied far better and that established
solutions exist for it. By contrast, a number of state-of-the-art techniques
are needed just to make a highly parallel OE-VLSI system operational; the
complexity of the physical assembly of such a system alone is impressive.
This dissertation looks at the complexity involved from a different point of
view: that of a system designer considering an OE-VLSI solution. The main aim
of this work is to ease the integration of an OE-VLSI system into a digital
design. To that end, a number of methodological and operational aspects are
examined closely.
Computer support for OE-VLSI design We discuss a number of methodological
issues related to supporting OE-VLSI systems in electronic design automation
(EDA) tools. Several steps in the design of OE-VLSI systems can benefit from
this. We present an analysis of the basic support required, discuss the
possible approaches to it and refer to relevant realizations. This basic
support comprises the virtual construction of the system, system simulation,
the derivation of system-level properties and the exploration of the design
space.
We have described circuit-level simulation models for OE-VLSI components in
the mixed-signal language Verilog-AMS and integrated them into an existing EDA
framework. The different simulation models can be chained together to simulate
optical interconnections. We discuss the issues of integrating
multidisciplinary (electrical, optical, thermal) and sometimes ill-conditioned
differential equations (such as the rate equations that describe lasers) into a
simulation system with a historically purely electrical focus.
Our circuit-level modeling has laid the foundations for two Master’s theses on
the further characterization and optimization of the components of an OE-VLSI
system. In the context of the EC’s PICMOS project on optical interconnections
inside chips, we have described a simulation model for a minuscule laser diode
in Verilog-AMS. Within that project, this model has further been used to build
a simulation-based performance predictor for the systematic synthesis of
optical interconnections inside chips.
Statistical OE-VLSI modeling and characterization At the start of a digital
system design, many questions arise. Whether an OE-VLSI system is suitable for
a digital design depends on a number of crucial properties. What about the
timing and the integrity of transmitted signals? How large is the latency
difference, at the output, between synchronously transmitted signals of the
same parallel link? Does the optical link require a balance between zeros and
ones in the digital data stream? What throughput is attainable? Where does the
energy go, and how much power is required for an optimal result?
The answer to such questions is decisive for the attractiveness of the OE-VLSI
system and determines the feasibility of different timing models and the need
for special provisions for signal encoding.
We have investigated the relation between the many efficiency and accuracy
aspects of a complex OE-VLSI assembly and overall link properties such as
signal attenuation, offset, latency, jitter, noise and the resulting bit error
ratio (BER).
To this end, a modeling and characterization effort of a pronounced statistical
nature was undertaken. We have developed practical stochastic models that
capture the behavior and uniformity of the different optical link components:
driver circuit, VCSEL, optical path and receiver circuit. These models
specifically deal with aspects of signal levels, timing and noise behavior. The
methodology used is generally applicable; quantitative analyses rely on a
thorough measurement campaign on real OE-VLSI hardware (from the IO project).
For optimal insight, the parallel optical link was divided into as many pieces
as possible in this measurement campaign. Cuts were made between driver
circuits and lasers, in front of and behind the ends of the optical fiber
bundle, and between photodiodes and receiver circuits.
Much attention is paid to the statistical modeling of the butt coupling
efficiency between a laser die and the ends of the optical fibers of a
multi-fiber optical connector. To this end, we have realized a three-axis servo
drive for an adjustable relation between the position of a single optical fiber
and an OE-VLSI chip package. This system has been able to autonomously record
the position-dependent coupled power between all 8×8 devices of a laser or
photodiode die and an optical fiber.
We have used the measurement results obtained to characterize a simple ray
model for the expected coupling efficiency. A stochastic model (which takes
process variations into account) for the position-dependent coupled power
between a VCSEL and a fiber has been characterized as well.
When an optical fiber bundle is fitted with a connector, the relative positions
of the fiber ends and the alignment features of the connector become fixed. For
an OE-VLSI chip package into which a connector is plugged, the various
deviations from the ideal positional relation between optoelectronic devices
and fibers are therefore strongly correlated. We have combined our statistical
position-dependent VCSEL-fiber coupling characterization with a stochastic
model of the joint mutual positioning of fiber ends and optoelectronic devices
in an OE-VLSI chip package with a plugged-in connector. The latter model was
developed and characterized for OE-VLSI chip packages and optical connectors of
the IO project (an achievement of fellow researcher Olivier Rits).
For the hardware investigated, both the variability between VCSELs and the
positioning uncertainty turn out to be significant. Correlations between
aspects of different channels are brought about by a substantial joint
positional shift of the fibers. The influence of a joint rotation turns out to
be dwarfed by the process variability of VCSELs. Finally, we have set up a
manageable stochastic model for joint laser-fiber coupling and have
characterized its distributions.
Combining the characterized models of all link components allowed us to
quantify the overall signal attenuation, latency differences, jitter, noise and
the resulting BER. This covers not only a statistical evaluation of a single
channel, but also the statistical dependencies between different channels of
the same bundle.
We have immediately used the results to check whether a source-synchronous
timing model with a dedicated clock channel is feasible over an OE-VLSI link
with a high physical parallelism; this can reduce costly resynchronization
circuits, which are normally needed per channel, to a single instance. Whether
a prolonged imbalance between the number of ones and zeros in a digital data
stream can be allowed was investigated as well.
For the hardware of the IO project, a shared logic threshold, a provision that
allows the desired imbalance, turns out to introduce bit errors too often. An
efficient source-synchronous timing model with a dedicated clock channel is
feasible at a small throughput cost.
Substrate noise coupling between digital CMOS and OE-VLSI receivers
In an OE-VLSI system, optical receiver circuits end up on the chip next to each
other and next to frequently switching digital logic. In this situation, there
is a chance that digital switching noise coupling through the substrate to the
receiver circuits prevents the latter from operating correctly. Substrate noise
is a problem that reaches far beyond OE-VLSI systems: it appears in all
mixed-signal designs where sensitive analog circuits are embedded in a hostile
digital environment.
We have investigated the coupling of local CMOS substrate noise to circuits in
the immediate vicinity. To this end, configurable circuits for substrate noise
generation and a noise measurement circuit were realized in CMOS technology
(0.18 µm with a high-resistivity substrate) and placed next to sensitive
optical receiver circuits of the IO project.
A large measurement bandwidth for substrate noise was observed; it is explained
from the detailed operation of the measurement circuit. Noise peaks with a
total duration of only 200 ps were recorded accurately. In addition, we have
worked out an approach to eliminate inaccuracies caused by a noisy reference
voltage or jittering sampling instants.
When one tries to obtain a detailed simulation of substrate noise coupling, the
enormous amount of computation required by current simulators strongly limits
the size of the circuits that can be analyzed. We were able to run simulations
of our noise circuits given a number of necessary simplifications, and a good
agreement between measurements and simulations was nevertheless obtained.
The receiver circuits proved to be sufficiently shielded against local noise
injection. The cause of a noise coupling that was observed nonetheless could be
identified as the lack of shielding for the bond pads of a delicate bias
voltage. Our results confirm that a guard ring suffices to absorb local
substrate disturbances. This confines the substrate noise problem to the
typical deviation between the potential of ground/supply nodes inside and
outside the chip. This problem already occurred in standalone receiver circuits
and can be tackled with established differential design techniques.
Summary
Optical interconnections inside the box? Electrical and optical interconnec-
tions are the foremost resources for guided-wave digital communication. The
length, throughput and application of an interconnection are decisive for the
choice between the alternatives.
Optical interconnect was commercially introduced in the late 1970s for
high-bandwidth multi-km telecommunications due to its vast power and
bandwidth advantages: the attenuation is essentially bandwidth independent
and much lower when compared to electrical interconnect. Since then, the
increasing bandwidth demands of end user applications and the rise of smaller
and cheaper optoelectronics have promoted ever shorter optical link lengths,
presently down to a few meters.
Electrical interconnect is the native interface of logic and memory primitives
inside computational systems. Here, the limiting factor for system performance
is often the interconnection latency. The finite speed of light encourages a
spatially dense setup, whereas the time needed to modulate a data packet
onto a link translates into a bandwidth requirement.
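The two latency contributions named above can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the packet size and the propagation velocity of 0.67 times the vacuum speed of light are assumptions chosen for the example, not figures from this work.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def link_latency_ns(length_m, packet_bits, bit_rate_gbps, lanes=1,
                    velocity_fraction=0.67):
    """Return (time_of_flight_ns, serialization_ns) for one packet.

    velocity_fraction is an assumed propagation speed relative to c;
    lanes is the number of parallel channels sharing the packet.
    """
    # Time of flight depends only on distance: a reason to build compactly.
    time_of_flight_ns = length_m / (velocity_fraction * SPEED_OF_LIGHT_M_PER_S) * 1e9
    # Serialization time depends only on aggregate bandwidth (bits / Gbps = ns).
    serialization_ns = packet_bits / (bit_rate_gbps * lanes)
    return time_of_flight_ns, serialization_ns


if __name__ == "__main__":
    for lanes in (1, 64):
        tof, ser = link_latency_ns(1.0, 512, 2.5, lanes=lanes)
        print(f"{lanes:2d} lane(s): flight {tof:.2f} ns, serialization {ser:.1f} ns")
```

With these assumed numbers, a single 2.5-Gbps lane is dominated by serialization time, whereas 64 parallel lanes bring the serialization time below the flight time over one meter.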
In a dense computational system, the cross-section available to an interconnec-
tion is limited. The resistivity of even the best available conductors limits the
electrically attainable bandwidth through a confined cross-section, given the
distance and power budget.
The transistor density of integrated circuits in complementary metal-oxide–
semiconductor (CMOS) technology continues to rise. Although more calcu-
lations can be performed in the same space and time, the bandwidth and
latency of electrical interconnections can hardly keep pace. Strongly con-
nected distributed algorithms face an actual interconnect-related performance
bottleneck.
Parallel short-range optical interconnections can provide a solution to the
bandwidth density problem. After all, on a system scale, the throughput
of even tight optical interconnections is limited by the bandwidth of active
components rather than the attenuation or dispersion of the optical path.
The break-even distance beyond which optical interconnect can outperform
electrical interconnect, with respect to bandwidth density at a similar power
budget, is presently estimated in the centimeter range.
Surface-level optical chip access We focus on the optoelectronic very large
scale integration (OE-VLSI) approach, which provides a surface-level optical
chip access to circumvent the bandwidth issues in the escape perimeter of a
dense electrical chip access. In an OE-VLSI approach, the data rate through a
square-millimeter–scale cross-section can reach several hundreds of gigabits
per second for lengths up to a few meters.
The OE-VLSI system studied here was developed as part of the
Interconnect by Optics (IO) project of the European Commission’s (EC’s) Fifth
Framework Programme. Two-dimensional arrays of vertical-cavity surface-
emitting lasers (VCSELs) and positive-intrinsic-negative (p-i-n) photodiodes con-
vert between electrical and optical signals. These arrays are flip-chip bonded
to the CMOS ICs to be linked, which are equipped with flip-chip pads and
special interfacing circuitry. Two-dimensional arrays of optical fibers provide
the optical path through a window in the chip package. In the IO project, 8×8
arrays were used at a 250-µm device pitch. A signaling rate of 2.5 Gbps per
channel was obtained.
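As a quick check of the orders of magnitude, the array figures quoted above can be combined into an aggregate data rate and a bandwidth density. This is a back-of-the-envelope sketch: it assumes that all 64 channels carry data and that the array footprint equals the device count times the pitch, neither of which is stated in the text.

```python
def array_throughput(rows=8, cols=8, rate_gbps=2.5, pitch_um=250.0):
    """Return (total_gbps, footprint_mm2, density_gbps_per_mm2).

    Assumes every device is a data channel and approximates the
    footprint as (rows * pitch) x (cols * pitch).
    """
    channels = rows * cols
    total_gbps = channels * rate_gbps
    footprint_mm2 = (rows * pitch_um / 1000.0) * (cols * pitch_um / 1000.0)
    return total_gbps, footprint_mm2, total_gbps / footprint_mm2


if __name__ == "__main__":
    total, area, density = array_throughput()
    print(f"{total:.0f} Gbps over {area:.1f} mm^2 -> {density:.0f} Gbps/mm^2")
```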
This thesis Although the limitations of a dense electrical IC access are
avoided, the intricacies are much better known and solution methodologies
are far more established in the electrical case than in the OE-VLSI case.
The complexity of just the physical assembly of a highly parallel OE-VLSI
manifestation is impressive, as a number of cutting-edge techniques are
required just to make things work.
This dissertation focuses on another kind of complexity—that which is pre-
sented to a system designer considering an OE-VLSI approach. The main
motivation of this work is to facilitate the integration of an OE-VLSI setup
in a digital system, assessing methodological and operational aspects where
possible.
Design automation issues We discuss a number of methodological issues
concerning electronic design automation (EDA) tool support for OE-VLSI
systems. The design of OE-VLSI systems consists of several steps, each
needing some form of EDA support. We present a breakdown of the fundamental
design automation support (design creation, simulation, extraction of
system-level properties and design space exploration), discuss the
methodologies involved and refer to relevant realizations.
We have implemented circuit-level simulation models for the components of
an OE-VLSI link in the mixed-signal language Verilog-AMS and integrated
them into an EDA framework. The different simulation models can be chained
together to simulate optical interconnect. We discuss the intricacies of
integrating multidisciplinary (electrical, optical, thermal) and sometimes
badly conditioned differential equations, e.g., laser rate equations, into a
simulation system that bears the marks of its dedication to electrical circuits.
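To give a feel for why laser rate equations can be awkward to integrate, the sketch below steps a textbook single-mode pair of rate equations with a fixed-step Runge-Kutta scheme. It is not the Verilog-AMS model of this work: the normalized units, all parameter values and the function name are assumptions made for illustration. The photon lifetime (2 ps here) is far shorter than the carrier lifetime (1 ns here), so the explicit integrator needs a very small step to remain stable.

```python
def simulate_laser(j_drive=3.0, t_end=10.0, dt=1e-4,
                   tau_n=1.0, tau_p=0.002, gain=1000.0, n0=1.0, beta=1e-4):
    """Integrate normalized single-mode laser rate equations with RK4.

    Time is in ns; n (carriers) and s (photons) are in arbitrary units.
    All default values are illustrative assumptions.
    Returns the (n, s) state at t_end.
    """
    def deriv(n, s):
        stimulated = gain * (n - n0) * s          # stimulated emission rate
        dn = j_drive - n / tau_n - stimulated     # carrier balance
        ds = stimulated - s / tau_p + beta * n / tau_n  # photon balance
        return dn, ds

    n, s = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1n, k1s = deriv(n, s)
        k2n, k2s = deriv(n + 0.5 * dt * k1n, s + 0.5 * dt * k1s)
        k3n, k3s = deriv(n + 0.5 * dt * k2n, s + 0.5 * dt * k2s)
        k4n, k4s = deriv(n + dt * k3n, s + dt * k3s)
        n += dt * (k1n + 2 * k2n + 2 * k3n + k4n) / 6.0
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
    return n, s


if __name__ == "__main__":
    n_final, s_final = simulate_laser()
    # Above threshold, the carrier density clamps near n0 + 1/(gain * tau_p).
    print(f"carriers {n_final:.4f}, photons {s_final:.5f}")
```

With these assumed parameters the threshold carrier density is 1.5, and the simulated state settles there after the turn-on delay and the relaxation oscillations have died out.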
The circuit-level modeling work performed lies at the basis of two Master’s
theses on the characterization and tuning of OE-VLSI components. In the
context of the EC’s PICMOS project on intra-chip interconnect, we have
implemented a microlaser simulation model in Verilog-AMS, which has been
used in the project for the systematic simulation-based predictive synthesis of
integrated optical interconnect.
Statistical OE-VLSI modeling and characterization At the creation of a
new digital system, several questions need to be addressed. In order to even
bring OE-VLSI into the picture, it is vital to know what one can do with the
parallel optical link provided. What about the signal timing and integrity?
How skewed are the signals of adjoining links at the link exit if they were
synchronous at the link entrance? Should the optically signaled data be dc-
balanced, as it is in telecommunication applications? What is the attainable
bandwidth? How much power to spend, and where does it go?
Answers to questions like these determine the attractiveness of the system,
and are key to the feasibility of—or need for—different approaches concerning
the encoding of the data and the signal timing approach taken.
We have examined the connection between the many efficiency and accuracy
aspects of a sophisticated OE-VLSI assembly and resulting across-the-board
optical interconnect characteristics such as overall signal attenuation and offset,
latency, jitter, noise and ensuing bit error ratio (BER) figures.
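The step from signal levels and noise to a BER figure is commonly made through the Q factor under a Gaussian noise assumption. The sketch below shows that standard relation; it is a generic textbook formula and not necessarily the exact procedure followed in this work, and the function names and example levels are hypothetical.

```python
import math


def q_factor(mean_one, mean_zero, sigma_one, sigma_zero):
    """Q factor of a binary signal with Gaussian noise on both levels."""
    return (mean_one - mean_zero) / (sigma_one + sigma_zero)


def ber_from_q(q):
    """Bit error ratio for an optimally placed decision threshold."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical received levels and noise, in arbitrary units.
    q = q_factor(mean_one=1.0, mean_zero=0.0, sigma_one=0.09, sigma_zero=0.07)
    print(f"Q = {q:.2f}, BER = {ber_from_q(q):.2e}")
```

Under this model, attenuation and offset act on the level separation in the numerator and noise acts on the denominator, which is why a modest loss in coupled power can change the BER by orders of magnitude.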
A modeling and characterization approach of a strong statistical nature was
adopted. We have developed simple yet adequate stochastic models to cap-
ture the behavior and uniformity of the different subsystems of the optical
interconnect—driver circuit, VCSEL, optical path, photodiode and receiver
circuit. These models specifically address amplitude, timing and noise behav-
ior. The methodology is generally applicable; for the purpose of quantitative
analyses our models have been characterized based on detailed and extensive
measurements of the actual performance on IO project hardware. In order
to capture as much information as possible, the measurement approach
divided the parallel optical link into as many parts as possible: by
cutting in between driver and laser arrays, in front of and behind the fiber
assembly, and in between photodiode and receiver arrays.
Major attention is given to the statistical modeling of the butt coupling ef-
ficiency between a VCSEL array and a multi-fiber connector. To this end, a
3-axis positioning setup for a controllable alignment of an optical fiber and an
OE-VLSI module has been realized. This system was autonomously able to
accurately scan the alignment-dependent coupled power between all devices
of an 8×8 VCSEL (or photodiode) array and a fiber.
We have used the acquired measurements to characterize a simple ray tracing
model for the expected coupling efficiency on an 8×8 VCSEL array. A stochas-
tic model (taking process variations into account) for the alignment-conditional
power coupling between a VCSEL and a fiber has been characterized as well.
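A ray-based estimate of the alignment-dependent coupling can be sketched with a few lines of Monte Carlo code. This is a deliberately crude stand-in for the ray tracing model mentioned above: the point-source emitter, the Gaussian angular spread, the gap, the core radius and the numerical aperture are all assumed values, not measured IO project parameters.

```python
import math
import random


def coupling_efficiency(dx_um, dy_um=0.0, gap_um=50.0, core_radius_um=25.0,
                        na=0.2, divergence_sigma_deg=5.0, rays=20000, seed=1):
    """Fraction of rays from a point emitter accepted by an offset fiber.

    A ray is accepted when it lands inside the fiber core and arrives
    within the acceptance angle set by the numerical aperture.
    """
    rng = random.Random(seed)  # fixed seed: repeatable estimates
    sigma = math.radians(divergence_sigma_deg)
    max_angle = math.asin(na)
    accepted = 0
    for _ in range(rays):
        slope_x = math.tan(rng.gauss(0.0, sigma))
        slope_y = math.tan(rng.gauss(0.0, sigma))
        # Landing point on the fiber facet, relative to the fiber axis.
        x = gap_um * slope_x - dx_um
        y = gap_um * slope_y - dy_um
        angle = math.atan(math.hypot(slope_x, slope_y))
        if math.hypot(x, y) <= core_radius_um and angle <= max_angle:
            accepted += 1
    return accepted / rays


if __name__ == "__main__":
    for offset in (0.0, 10.0, 20.0, 30.0, 40.0):
        print(f"offset {offset:4.0f} um -> efficiency {coupling_efficiency(offset):.3f}")
```

Scanning the offset in this way produces the kind of alignment-dependent coupling curve that the positioning setup measures on real hardware.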
When a fiber bundle is connectorized, the positions of the fiber facets are
mutually fixed relative to each other as well as relative to the alignment
features of the connector. When the connector is plugged into an OE-VLSI
package, the misalignment of device-fiber pairs at different positions of an
O-E array is therefore strongly correlated. We have combined our alignment-
conditional VCSEL-fiber coupling characterization with a stochastic model
of the array-wide alignment of a connectorized fiber bundle to an O-E array.
The latter model was developed and characterized for OE-VLSI packages and
connectors of the IO project (courtesy of fellow researcher Olivier Rits).
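The correlation described above follows directly from rigid-body geometry: one global translation and one global rotation of the connector determine the misalignment of every device-fiber pair. A minimal sketch, assuming rotation about the array center and the 8×8, 250-µm array layout; the function name and the example offsets are hypothetical.

```python
import math


def device_misalignments(tx_um, ty_um, theta_rad, n=8, pitch_um=250.0):
    """Per-device misalignment (um) for a rigid shift and rotation.

    The fiber bundle is translated by (tx_um, ty_um) and rotated by
    theta_rad about the array center; devices sit on an n x n grid.
    """
    half = (n - 1) / 2.0
    cos_t, sin_t = math.cos(theta_rad), math.sin(theta_rad)
    result = []
    for row in range(n):
        for col in range(n):
            x = (col - half) * pitch_um
            y = (row - half) * pitch_um
            fiber_x = x * cos_t - y * sin_t + tx_um
            fiber_y = x * sin_t + y * cos_t + ty_um
            result.append(math.hypot(fiber_x - x, fiber_y - y))
    return result


if __name__ == "__main__":
    shift_only = device_misalignments(3.0, 4.0, 0.0)
    rotation_only = device_misalignments(0.0, 0.0, 1e-3)
    print(f"translation: every device off by {shift_only[0]:.2f} um")
    print(f"rotation: {min(rotation_only):.3f} to {max(rotation_only):.3f} um")
```

A pure translation misaligns all 64 pairs identically, while a rotation affects the corner devices most; drawing the three global parameters from distributions turns this into a stochastic array-wide model.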
On the IO project hardware, it turns out that both the inter-VCSEL variability
and the fiber bundle alignment uncertainty are significant. Effects with array-
wide correlation are produced by significant global translational misalignment,
and the smaller impact of global rotational misalignment turns out to be
dwarfed by the VCSEL process variations. In conclusion, we have derived a
practicable stochastic model for the array-wide laser-fiber coupling and have
characterized its distributions.
By bringing all characterized component models together, we have achieved
the quantification of all-inclusive signal attenuation and offset, latency devia-
tion, jitter, noise and ensuing BER figures, not only including the statistical
evaluation of isolated optical channels, but also capturing the statistical de-
pendencies between different channels juxtaposed within the same array
or package.
The results obtained have been directly put to work to evaluate the option of
using true source-synchronous signaling (with a dedicated clock channel) over
optical interconnect with a high physical parallelism, reducing the substantial
per-channel clock synchronization circuitry to one instance. We have also
looked into dc-unbalanced signaling to remove the need for data coding.
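For the source-synchronous option, feasibility comes down to whether inter-channel skew and jitter fit inside one bit period together with the sampling requirements of the receiving flip-flops. The sketch below shows such a budget; the function name and all timing values in the example are hypothetical, with only the 2.5-Gbps rate taken from the text.

```python
def source_sync_margin_ps(bit_rate_gbps, skew_pp_ps, jitter_pp_ps,
                          setup_ps, hold_ps):
    """Timing margin (ps) left in one unit interval.

    A negative result means the data channels cannot be sampled
    reliably with the forwarded clock without per-channel retiming.
    """
    unit_interval_ps = 1000.0 / bit_rate_gbps
    return unit_interval_ps - (skew_pp_ps + jitter_pp_ps + setup_ps + hold_ps)


if __name__ == "__main__":
    margin = source_sync_margin_ps(2.5, skew_pp_ps=100.0, jitter_pp_ps=80.0,
                                   setup_ps=50.0, hold_ps=50.0)
    print(f"margin at 2.5 Gbps: {margin:.0f} ps")
```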
For the IO project hardware, the usage of a common logic threshold across
all channels, required for dc-unbalanced signaling, appears infeasible after all
models are combined. Efficient true source-synchronous signaling turns out
to be in reach in carefully designed systems.
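The common-threshold question can be phrased as a simple interval test: a single fixed threshold works only if the highest "zero" level over all channels stays below the lowest "one" level, with some noise margin on both sides. The sketch below illustrates that test with hypothetical per-channel levels; it is a simplification and does not reproduce the combined statistical models of this work.

```python
def common_threshold_window(levels, margin=0.0):
    """Return the (low, high) window for a shared threshold, or None.

    levels is a list of (zero_level, one_level) pairs, one per channel;
    margin is the noise allowance required on each side of the threshold.
    """
    highest_zero = max(zero for zero, _ in levels) + margin
    lowest_one = min(one for _, one in levels) - margin
    if highest_zero < lowest_one:
        return highest_zero, lowest_one
    return None


if __name__ == "__main__":
    uniform = [(0.10, 1.00), (0.12, 0.90), (0.08, 1.10)]
    spread = [(0.10, 1.00), (0.45, 1.60), (0.05, 0.40)]  # strong channel variation
    print("uniform channels:", common_threshold_window(uniform, margin=0.05))
    print("spread channels: ", common_threshold_window(spread, margin=0.05))
```

When the channel-to-channel spread in coupled power is large, the window closes, which is the situation the text reports for the IO project hardware.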
Substrate noise coupling between digital CMOS and OE-VLSI receivers
In an OE-VLSI approach, optical receiver circuits are integrated in a chip
alongside each other and rapidly switching digital circuits. In this situation,
coupling of digital switching noise through the substrate can compromise the
accurate operation of the receiver circuits. Besides its relevance for OE-VLSI
systems, substrate noise is an important issue in all mixed-signal designs
where sensitive analog circuits are embedded in a hostile digital environment.
We have researched the intrusion of localized CMOS substrate disturbances
to nearby positions. To this end, we have implemented adjustable substrate
noise generation circuits and a noise measurement circuit next to sensitive
receiver circuits of the IO project in 0.18–µm bulk type CMOS.
We have achieved a large substrate noise measurement bandwidth, which is
explained from the detailed circuit behavior. Accurate waveform tracing of
pulses as narrow as 200 ps has been demonstrated. Furthermore, we have
devised an approach to mitigate measurement inaccuracies resulting from a
noisy voltage reference or sampling time jitter.
When one tries to simulate noise coupling effects in detail, the computational
overhead of existing substrate noise simulation tools is tremendous for all
but the smallest designs. We have performed such simulations, necessarily
simplified to a certain degree, and a good agreement between measurement
and simulation was obtained.
The receiver circuits turned out to be adequately shielded against localized
substrate disturbances, yet we were able to identify one remaining disruptive
path to the unshielded bond pads of a sensitive bias net. We could confirm that
a guard ring protection of receiver circuits suffices to absorb local substrate
disruptions. This limits the problem of substrate noise coupling to global
supply and ground bounce, a problem which already had to be
taken into account in standalone receivers; it can be dealt with through an
established differential circuit design.
Examination commission

prof. dr. ir. Ronny Verhoeven
Chairman
Director of Education
Faculty of Engineering

dr. ir. Joni Dambre
Secretary
ELIS Department
Faculty of Engineering

dr. ir. Richard Annen
Miromico AG
Zürich, Switzerland

prof. dr. ir. Roel Baets
INTEC Department
Faculty of Engineering

dr. ir. Christof Debaes
TONA Department
Faculty of Applied Sciences
Vrije Universiteit Brussel

prof. dr. ir. Pieter Rombouts
ELIS Department
Faculty of Engineering

prof. dr. ir. Jan Van Campenhout
Advisor
ELIS Department
Faculty of Engineering
List of symbols
·ˆ indicates an estimator
〈·〉 average over all components of a vector
α absorption coefficient in the Urbach region
αm transmission coefficient of a DBR
β spontaneous emission coefficient; decision threshold bias
βd,k k-th sample of the decision threshold bias for array channel d
βauto automatically inferred decision threshold bias
γ damping coefficient
∆AVG array-wide deviation of the average optical power of a
VCSEL array (in dB)
∆AVGd deviation of the average optical power of a VCSEL array at
position d (in dB)
∆I deviation of the VCSEL current from the operating point
∆Iin photocurrent swing
∆N deviation of the number of excited carriers in the active region of a
VCSEL from the operating point
∆OMA array-wide deviation of the optical modulation amplitude of a
VCSEL array (in dB)
∆OMAd deviation of the optical modulation amplitude of a VCSEL array at
position d (in dB)
∆S deviation of number of photons confined in a VCSEL from the
operating point
∆t sampling time step
∆Vout output voltage swing of a TIA
δ expectation of Tcd,d
ε gain compression factor; band gap energy
H, η differential VCSEL efficiency
ηv coupling efficiency of the v-th VCSEL given the current and fiber
position
Θ MOSFET threshold mismatch in a current mirror
θ azimuthal component of r
θd MOSFET threshold mismatch in the d-th current mirror
λ wavelength; time constant of a latch comparator
µ expected value (often the random variable is written in the
subscript)
ν frequency of electromagnetic radiation
π the ratio of a circle's circumference to its diameter
ρ correlation between two random variables (indicated in subscript or
clear from the context)
σ standard deviation (often the random variable is written in the
subscript)
σall standard deviation of the analog modulation amplitude of an optical
link signal from end to end
σN,d,k k-th sample of the standard deviation of the amplitude noise for
array channel d
τ time constant in general; specifically related to: voltage amplifier;
jitter in an equivalent time measurement
τN carrier lifetime
τS photon lifetime; length of a sampling window
Φ cumulative distribution function of the standard normal distribution
φ phase
φ0 frequency dependent phase response
ω angular frequency (pulsation)
ωR relaxation oscillation frequency
A the array-wide VCSEL-fiber alignment
a frequency dependent magnitude response
ak k-th sample of the array-wide VCSEL-fiber alignment
AVG fiber-coupled average optical power over a full VCSEL array
(random variable across VCSEL arrays)
AVGd fiber-coupled average optical power for the VCSEL at a certain
position in an array (random variable across VCSEL arrays)
avgd,k k-th sample of the fiber-coupled average optical power for array
channel d
AVG, avg fiber-coupled average optical power (random variable across
individual VCSELs; specific value)
avgv fiber-coupled average optical power of the v-th VCSEL
B bandwidth of a single electrical link
B0 bandwidth factor [Miller and Özaktaş, 1997]
Bamp voltage amplifier bandwidth
BER bit error ratio
C capacitance in general; specifically: photodiode depletion and input
MOSFET capacitance of a TIA
Ca parasitic capacitance in the active region of a VCSEL
Cdep photodiode depletion capacitance
d serially numbered position in a VCSEL array; serial number of the
physical VCSEL driver circuit being characterized (only in section
4.2.1); substrate thickness of VCSEL and photodiode arrays;
parabola parameter
E expectation
e Euler’s number
fτ pdf of the jitter in an equivalent time measurement
fn pdf of the additive noise in an equivalent time measurement
G gain factor
Gamp voltage amplifier gain
gm transconductance
H amount of information; modulation transfer function
H0 modulation transfer function scale factor
h Planck’s constant
I indicator function
I, i current in general; specifically: injected current in the active region
of a VCSEL; photocurrent
I0, i0 VCSEL off-current; VCSEL current at an operating point
I1, i1 VCSEL on-current
i1,aim targeted VCSEL on-current
i1,d actual on-current measured in the d-th VCSEL driver
Id dark current
Id,c temperature-independent dark current contribution
Idiff difference in drain currents between the sides of a latch comparator
Imod, imod VCSEL modulation current
imod,aim targeted VCSEL modulation current
Ith, ith VCSEL threshold current
J timing jitter
j imaginary unit
K MOSFET transconductance mismatch in a current mirror
k Boltzmann constant
kd MOSFET transconductance mismatch in the d-th current mirror
L actual packet latency of an electrical link; outbound optical power
Ls series inductance of a photodiode
l average length of a VCSEL cavity roundtrip; incident optical power
on a photodiode
lv total output power of the v-th VCSEL given the current
M0, M1 moment (mathematics)
m zero-crossing slope of a waveform signal; frequency of 1s in a
repeated equivalent-time measurement
mv measured optical power of the v-th VCSEL given the current and
fiber position
N number of electron-hole pairs in the active region of a VCSEL;
additive noise signal; normal distribution; number of time steps
N0 number of electron-hole pairs in the active region of a VCSEL at an
operating point
Ne VCSEL diode saturation carrier density
Nk additive noise signal at the output of the k-th TIA
NRX,k modulation amplitude independent additive TIA noise contribution
at the output of the k-th TIA
NTX modulation amplitude dependent additive TIA noise contribution
factor of oma[mW]
Ntr transparency carrier count
n emission coefficient; additive noise in an equivalent time
measurement
nd number of VCSEL positions in an array; number of parallel data
channels
nv number of VCSELs characterized
OMA fiber-coupled optical modulation amplitude over a full VCSEL array
(random variable across VCSEL arrays)
OMAd fiber-coupled optical modulation amplitude for the VCSEL at a
certain position in an array (random variable across VCSEL arrays)
omad,k k-th sample of the fiber-coupled optical modulation amplitude for
array channel d
OMA, oma fiber-coupled optical modulation amplitude (random variable across
individual VCSELs; specific value)
omav fiber-coupled optical modulation amplitude of the v-th VCSEL
P probability
p parabola
p1–p4 signal waveform of the sequences 010, 011, 110, 111
q elementary charge
q1–q4 signal waveform of the sequences 000, 001, 100, 101
R photodiode responsivity; resistance in general; specifically:
transimpedance
R0 maximal photodiode responsivity
Ra parasitic resistance in the active region of a VCSEL
Rs parasitic series resistance of a VCSEL or photodiode
r fiber misalignment (r, θ, z) relative to the center of a VCSEL
r radial component of r
S number of photons confined in a VCSEL
S0 number of photons confined in a VCSEL at an operating point
S11 input port reflection coefficient (scattering parameter)
S21 forward gain (scattering parameter)
T time of flight + intrinsic delay of a link; temperature;
equivalent-time sampling aperture duration
Tcd,d, tcd,d delay of the d-th receiver circuit plus the clock distribution delay to
the d-th channel
Ts time difference between the sampling epoch and the center of a
signal eye
Ts,d Ts of channel d
t time
ts,µJ expectation of Ts
ttop time at a noise pulse peak
V voltage (often subscripted with a node name)
Vdiff difference in drain voltages between both input transistors of a latch
comparator
Vin input voltage of a latch comparator
Vref reference voltage of a latch comparator
Vt MOSFET threshold voltage
v serial number of the physical VCSEL being characterized; true
voltage waveform
v0 specific Vref value
v1 bias voltage for VCSEL on-current distribution
vg average photon velocity inside a VCSEL
vmod bias voltage for VCSEL modulation current distribution
vtop height of a noise pulse peak
w0 minimal spot size of a Gaussian beam
z vertical component of r (elevation of the fiber over the array surface)
zref location of the waist of a Gaussian beam below the array surface
List of abbreviations
AC alternating current
A/D analog to digital
AMS analog and mixed-signal (also: Austria Mikro Systeme)
ASIC application-specific integrated circuit
AVG average
BER bit error ratio
BGA ball grid array
BSC boundary scan cell
cdf cumulative distribution function
CMOS complementary metal–oxide–semiconductor
CV coefficient of variation
DBR distributed Bragg reflector
DC direct current
DDR double data rate
DTA digital technology assessment
EDA electronic design automation
EM electromagnetic
EMI electromagnetic interference
EPI epitaxial
ESD electrostatic discharge
FC/PC ferrule connector / physical contact
FET field-effect transistor
FPGA field programmable gate array
FR-4 flame resistant 4
GI gradient index
HDL hardware description language
HPC high performance computing
IC integrated circuit
III-V groups of the Periodic Table
IO Interconnect by Optics
I/O input/output
IP intellectual property or internet protocol
ITRS International Technology Roadmap for Semiconductors
JTAG Joint Test Action Group
LED light-emitting diode
LI light–current
LVDS low voltage differential signaling
LVS layout versus schematic
MC Monte Carlo
MCM multi-chip module
MOS metal–oxide–semiconductor
MOSFET metal–oxide–semiconductor field-effect transistor
MPO multi-fiber push-on
MPW multi-project wafer
MQW multiple quantum well
MT mechanically transferable
NA numerical aperture
ODE ordinary differential equation
O-E optical-electrical
OE-VLSI optoelectronic very large scale integration
OIIC Optically Interconnected Integrated Circuits
OMA optical modulation amplitude
OSI Open Systems Interconnection
p-i-n positive–intrinsic–negative
PC personal computer
PCB printed circuit board
PCI Peripheral Component Interconnect
PCIe Peripheral Component Interconnect Express
PDE partial differential equation
pdf probability density function
PGA pin grid array
PICMOS Photonic Interconnect Layer on CMOS by Waferscale Integration
POF plastic optical fiber
PRBS pseudorandom binary sequence
RAM random access memory
RC resistor/capacitor
RCLED resonant-cavity light-emitting diode
RLC resistor/inductor/capacitor
RX receiver, reception
SAV surface abstract view
SEM scanning electron microscope
SiP system-in-a-package
SMP symmetric multiprocessing
SNA Substrate Noise Analyst
SNAP12 name of 12-channel parallel optical connector specification
SPICE Simulation Program with Integrated Circuit Emphasis
SRAM static random access memory
TIA transimpedance amplifier
TIR total internal reflection
TX transmitter, transmission
UV ultraviolet
VCSEL vertical-cavity surface-emitting laser
VHDL very-high-speed integrated circuit hardware description language
WDM wavelength division multiplexing
Contents
Acknowledgements
Dutch summary
English summary
Examination commission
List of symbols
List of abbreviations
Contents
1 Introduction
1.1 Digital communication
1.2 Interconnect terminology
1.3 Electromagnetic radiation as an information carrier
1.3.1 Free-space approach
1.3.2 Guided-wave approach
1.3.2.1 Electrical interconnect
1.3.2.2 Waveguides
1.3.2.3 Optical interconnect
1.4 Electrical versus optical interconnect
1.4.1 Interconnect demands and typical realizations
1.4.1.1 Telecommunication
1.4.1.2 Computational systems
1.4.2 Bandwidth limitations of closely packed electrical links
1.4.2.1 Fundamental bandwidth density limitations
1.4.2.2 Rising bandwidth density needs
1.4.2.3 Extending the reach of established approaches
1.4.3 Optical interconnect: a superior bandwidth density alternative
1.4.3.1 Justifiable deployment at ever shrinking distances
1.4.3.2 OE-VLSI: tight optical interconnect integration
1.5 This thesis
1.6 Overview of achievements
1.6.1 Circuit-level OE-VLSI component modeling
1.6.2 Statistical modeling and characterization of an OE-VLSI system
1.6.3 Substrate noise
1.7 Structure of the thesis
1.8 Publications
2 OE-VLSI approach and prototype realization
2.1 Photonics and CMOS
2.2 Light generation
2.2.1 Generation or modulation?
2.2.2 VCSELs
2.2.3 Bottom-side emission
2.2.4 Choice of wavelength
2.3 Light detection
2.3.1 External or integrated?
2.3.2 p-i-n photodiodes
2.4 Flip-chip bonding
2.5 Packaging and optical path
2.5.1 Optical path: build-up or PCB-integrated?
2.5.2 MT/MPO-like approach
2.5.3 Package
2.5.3.1 General scheme
2.5.3.2 Silicon bench approach
2.5.3.3 Three-dimensional index alignment approach
2.5.4 Optical path
2.6 O-E interfacing circuits
2.6.1 Driver circuit
2.6.2 Receiver circuit
3 Design automation issues
3.1 Methodological aspects
3.1.1 Design creation
3.1.2 Pointwise design simulation
3.1.3 Modeling uncertainty
3.1.4 Design space exploration
3.2 Optical link building blocks at the CMOS level
3.3 Design simulation
3.3.1 Introducing photonics in circuit simulators
3.3.1.1 Spatially independent models
3.3.1.2 How to define light
3.3.2 IP protection of interfacing circuits
3.3.3 VCSEL model
3.3.3.1 Elementary lumped model
3.3.3.2 Advanced modeling aspects
3.3.3.3 Implementation issues
3.3.4 Other models
3.3.4.1 Photodiode
3.3.4.2 Optical path
3.4 Design for testability of optical links
3.5 Summary
4 Statistical OE-VLSI modeling and characterization
4.1 Introduction
4.1.1 Statistical modeling and characterization approach
4.1.2 Optical path characterization
4.1.3 Global interconnect parallelism evaluation
4.1.4 Structure of this chapter
4.1.5 A note on notation
4.2 From chip to light
4.2.1 Drive current uniformity
4.2.2 VCSEL array uniformity
4.2.2.1 Static uniformity
4.2.2.2 Dynamic response
4.2.2.3 Spectral uniformity
4.3 Light in flight
4.3.1 Sources of nonuniformity in the optical path
4.3.2 Experimental characterization of fiber coupling efficiency
4.3.2.1 Measurement setup
4.3.2.2 Experimental results
4.3.3 Coupling efficiency estimation using a ray tracing model
4.3.4 Misalignment-conditional characterization of VCSEL process variations
4.3.5 Stochastic evaluation of VCSEL-fiber power coupling in a connectorized package
4.4 From light to chip
4.4.1 Photodiode array uniformity
4.4.1.1 Static uniformity
4.4.1.2 Dynamic uniformity
4.4.2 Receiver circuit delay and skew
4.4.2.1 Introduction
4.4.2.2 Receiver skew
4.4.3 Bit error ratio characterization across photodiode-receiver pairs
4.4.3.1 Impact of timing jitter and offset on the bit error ratio
4.5 Putting things to work
4.5.1 Common sampling instant
4.5.2 Common logic threshold
5 Measurement and evaluation of direct substrate noise
5.1 Introduction
5.2 Problem Definition and Previous Work
5.2.1 Substrate noise
5.2.2 Measuring fast substrate signals
5.3 Estimating bandwidth and linearity of the latch comparator
5.3.1 Bandwidth
5.3.2 Linearity
5.4 Measuring in the presence of jitter and additive noise
5.5 Simulation setup
5.6 Experimental Results
5.6.1 Measurement setup
5.6.2 Substrate noise evaluation
5.6.3 Pulse reconstruction
5.7 Conclusion
6 Overall conclusions and perspectives
6.1 Design automation integration
6.2 Statistical modeling of uniformity
6.3 Substrate noise
6.4 Future work and outlook
References
A Project background
A.1 Introduction
A.2 OIIC project
A.2.1 IO project
A.3 Relevant IO project developments
A.3.1 DTA IC
A.3.2 Demonstrator PCBs
A.3.3 High-speed design in 0.18-µm CMOS
B Verilog-AMS simulation models
B.1 Simplified interfacing circuits
B.2 VCSEL
B.2.1 Multimode linear laser model
B.2.2 Lumped nonlinear laser model
C Ray tracing code
D Calculation details
D.1 Derivation of a stochastic model for array-wide fiber-coupled power
D.2 Calculation of the yield of an array-wide low BER when using a common logic threshold
D.3 Improved pulse position estimate
D.4 Pulse width and position estimate around the peak
Chapter 1
Introduction
1.1 Digital communication
Digital communication is the transport of digital data (anything that can be
encoded into a binary sequence) between physically separate locations. A
familiar example is the multimedia files and streams (text, audio, graphics,
video and similar structured content) that are sent through the Internet in
large amounts each day. This
type of communication can span thousands of kilometers. Yet the possible
range of communication distances is immense: in contrast to the size of
the Internet, the travel distance of a single bit between two gates in a computer
processor chip can be shorter than 1 µm.
The distance-related categorization of interconnects in electronic systems
based on the smallest physical entity which contains them—an integrated
circuit (IC), (chip) package, printed circuit board (PCB), rack, enclosed system,
or a smaller or larger network—is called the physical interconnect hierarchy
(figure 1.1) [Benner et al., 2005]. Interconnect on different levels can be
very dissimilar—beyond the obvious length difference—in terms of their
application, synchronization protocol, physical implementation, bit rate, link
density, latency and reliability. A high variety of requirements, feasible
solutions and cost tradeoffs between hierarchy levels lies at the basis of this
diversity.
This thesis focuses on interconnect distances from a few centimeters to one
meter, corresponding to the PCB and inter-PCB levels, and on an optical
interconnect technology to realize spatially very dense high-bandwidth links at these
levels. The specific subject of this thesis is the modeling and integration of
such interconnects. To situate this work, the following sections
provide a comprehensive introduction to different interconnect realizations in
general, with a focus on the demanding short links to which the scope of the
rest of this text is limited.
Figure 1.1: The physical interconnect hierarchy: digital interconnections can span
widely different distances. The inter- and intra-PCB (printed circuit board) levels
(indicated in bold) are focused on here. Figures taken from Benner et al. [2005] with
permission.
Levels shown, with typical distances: large networks, multi-km; long cables, 10–300 m;
short cables, 1–10 m; inter-PCB, 0.1–1 m; intra-PCB, 0.1–0.3 m; intra-package, 5–100 mm;
intra-chip, 0–20 mm.
1.2 Interconnect terminology
Throughout this thesis we refer many times to ‘interconnect’ using a number
of related terms, which are described (or defined) here for clarity. The inter-
changeable terms (inter)connection and link designate the path along which
communication between different parties takes place. According to the Open
Systems Interconnection Basic Reference Model [ISO/IEC 7498-1, 1994], an inter-
connection can be interpreted at various levels of abstraction. Descending this
hierarchy, abstract information to be transmitted over a logical (application-
level) link is being represented as binary data; further on, several protocol
layers and routing points can be involved.
At the lowest level are physical interconnection implementations only providing
the raw transport of bit streams. A physical interconnection is any trans-
mission path for waveforms representing binary data, delimited by active
components other than mere signal waveform regenerators. Signal regenerators
can amplify a waveform (1R regeneration); some additionally restore
digital levels (2R), and some provide retiming on top of that (3R). Retiming
provides a full discrete bit stream reinterpretation of received signal waveforms
before they are transmitted again.
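The three regeneration levels can be illustrated on a toy waveform. The following is a minimal sketch, not taken from this thesis: the function names, the gain of 5, the decision threshold of 0.5 and the four-samples-per-bit waveform are all illustrative assumptions.

```python
def regenerate_1r(samples, gain):
    """1R: re-amplify only; noise and edge displacement are amplified too."""
    return [gain * s for s in samples]


def regenerate_2r(samples, gain, threshold=0.5, low=0.0, high=1.0):
    """2R: re-amplify, then restore clean digital levels sample by sample."""
    amplified = regenerate_1r(samples, gain)
    return [high if s > threshold else low for s in amplified]


def regenerate_3r(samples, gain, samples_per_bit, threshold=0.5):
    """3R: additionally retime by sampling each bit period at its center.

    Returns the recovered bit sequence, i.e. the discrete reinterpretation
    that would be retransmitted as a fresh waveform.
    """
    reshaped = regenerate_2r(samples, gain, threshold)
    center = samples_per_bit // 2
    last_start = len(reshaped) - samples_per_bit
    return [
        1 if reshaped[start + center] > threshold else 0
        for start in range(0, last_start + 1, samples_per_bit)
    ]


# Hypothetical attenuated waveform for the bits 1, 0, 1 (4 samples per bit);
# the edges of the first bit arrive one sample late.
received = [0.02, 0.18, 0.21, 0.19,
            0.17, 0.03, 0.01, 0.02,
            0.03, 0.22, 0.20, 0.18]

print(regenerate_2r(received, gain=5.0))                      # levels restored, edges still late
print(regenerate_3r(received, gain=5.0, samples_per_bit=4))   # recovered bits
```

In this sketch the 2R output still carries the displaced edges of the input, whereas the 3R output depends only on the decision at each bit center, which is why a 3R regenerator can be treated as an endpoint where a discrete binary interpretation takes place.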
In this thesis, we focus on the implementation of physical interconnections
and the transformation of signal waveforms in between locations where a
discrete binary interpretation is performed. For this reason we silently include
3R regenerators as a delimiter of a physical interconnection. Furthermore,
from this point on without qualification the term interconnection will refer to a
physical interconnection. In the same way, interconnect is used as the collective
term for physical interconnections.
An interconnection is called parallel when it carries several physical channels—
different signal waveforms that can be distinguished on a spectral or geometric
basis, for instance by using different carrier frequencies or physically parallel
transmission lines or waveguides. Time domain multiplexing does not count
as this is indistinguishable from a bit stream associated with a single logical
connection from a signal waveform perspective.
A physically parallel interconnection can be broken down into simple interconnections,
each consisting of one transmission line or waveguide. The pieces remaining when
a simple interconnection is cut at all signal regenerator positions are called
segments.
1.3 Electromagnetic radiation as an information
carrier
The carrier for all present-day immediate digital communication is electro-
magnetic (EM) radiation due to its vast advantages over any other means of
information transport. It allows communication at the speed of light, and
the motion (or even the existence) of a physical transmission medium is not
required for signals to propagate.
Although the behavior of electric and magnetic fields and their interaction
with matter is always governed by the same Maxwell equations [1865], widely
different approaches are possible to use EM radiation as an information
carrier. The appearance of the implementation and the possible applications
heavily depend on the subset of the EM spectrum (figure 1.2) that is used for
communication, the relative size of the corresponding wavelengths compared
to the segment dimensions, and the choice between a free-space or a guided-
wave approach.
A note on terminology: the term wavelength can refer either to the vacuum
wavelength of EM radiation or to the wavelength inside a certain medium.
When discussing the order of magnitude of an unqualified wavelength com-
pared to something else, either definition will do—the difference between
both is usually less than a factor of 2. Apart from this, the vacuum wavelength
is addressed unless explicitly stated differently.
Figure 1.2: Overview of the electromagnetic spectrum. Image courtesy of Keiner
[2007].
1.3.1 Free-space approach
In the free-space approach, the propagation of EM waves predominantly
takes place freely inside a vacuum or a homogeneous non-magnetic dielectric
medium (such as air, plastic or glass). Free-space transmission comprises,
among others, radio and TV broadcast (though generally not digital), wireless
networking and free-space optical links. The communication is of a broadcast
or point-to-point nature depending on the wavelength, the link density and the
emission pattern. When the wavelength is larger than a fraction of the spacing
between links, or when the emission pattern is not sufficiently focused, the
communication medium is shared across multiple peers. In order to separate
different logical communication channels, time domain, frequency domain,
or spread-spectrum techniques are required. Examples are mobile phone
communication or wireless networking. Broadcasting is usually not an option
for general interconnect on the PCB or inter-PCB levels because the high
density of such links would divide the available communications bandwidth
and the transmitted power of each channel over too many peers to be practical,
let alone the transceiver complexity implied.
Still, point-to-point links are possible in the free-space approach. An example
is point-to-point microwave communication using two parabolic antennae.
For practical free-space point-to-point links on the PCB or inter-PCB level, the
wavelength should be shorter than about 4 µm (short-wavelength infrared
radiation) to limit the diffraction of directional beams over longer distances;
this figure corresponds to a Gaussian beam traveling 10 cm with a maximal
waist of 0.5 mm. The required coherent emission in this optical spectrum
range calls for laser light sources.
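The 4 µm figure can be checked with standard Gaussian beam relations. The sketch below rests on assumptions that the text does not state explicitly: the 0.5 mm is read as the waist diameter (waist radius 0.25 mm), the waist is placed midway along the 10 cm path, and the path is taken equal to the confocal length (twice the Rayleigh range), so that the beam radius grows by a factor of √2 at both ends. The function names are illustrative.

```python
import math


def longest_wavelength(path_length_m, waist_diameter_m):
    """Longest wavelength for which the confocal length 2*z_R covers the path.

    Uses the Rayleigh range z_R = pi * w0**2 / wavelength, with w0 the waist
    radius, and requires 2 * z_R = path_length_m (assumed criterion).
    """
    w0 = waist_diameter_m / 2.0
    return 2.0 * math.pi * w0 ** 2 / path_length_m


def beam_radius(z_m, w0_m, wavelength_m):
    """Gaussian beam radius at a distance z from the waist."""
    z_r = math.pi * w0_m ** 2 / wavelength_m
    return w0_m * math.sqrt(1.0 + (z_m / z_r) ** 2)


wavelength = longest_wavelength(path_length_m=0.10, waist_diameter_m=0.5e-3)
end_radius = beam_radius(0.05, 0.25e-3, wavelength)

print(f"wavelength limit: {wavelength * 1e6:.2f} um")
print(f"beam radius at the path ends: {end_radius * 1e3:.3f} mm")
```

Under these assumptions the limit comes out slightly below 4 µm, which is consistent with the order of magnitude quoted above; a different reading of "maximal waist" would shift the number by a small constant factor.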
In-system free-space optical interconnect has already been successfully demon-
strated on the System-in-a-Package (SiP, also known as MCM—multi-chip
module) level. For an example we refer to Thienpont et al. [2000]; Debaes et al.
[2003]. However successful on the SiP level, free-space optical links are less
suitable on the PCB level: the required directional accuracy at the origin of
a narrow optical beam becomes infeasibly strict if the interconnect length
exceeds several tens of centimeters [Baukens et al., 1999].
1.3.2 Guided-wave approach
In the guided-wave approach, the propagation of EM waves is forced to follow
a designated path. Again there are a number of options, mainly dependent on
the size of wavelengths relative to the longitudinal size of a segment (along
the direction of the information flow) and its transverse dimensions (in a plane
perpendicular to the information flow).
1.3.2.1 Electrical interconnect
Electrical interconnect is guided-wave interconnect using baseband frequencies
from dc to the microwave range (several tens of GHz). Characteristic is the
use of two conductors: a ‘single-ended’ signal path and return path, or a
differential conductor pair. As long as the wavelengths in use are about 100
times larger than the segment length and conductor pair spacing, from the
viewpoint of wave propagation both conductors look like single points. In
this case, a quasi-stationary treatment is applicable. The geometric realization
of these wires thus does not need to take wave effects into account, which
allows a simple implementation. For instance, a 1 GHz electrical link on one
IC connecting a digital gate output to several gate inputs (implemented by
a simple branched metal path and adequate return path provisions) can be
treated as a wire as long as it spans no more than a few millimeters in all
dimensions.
Figure 1.3: Common manifestations of impedance-controlled electrical interconnect:
(a) coaxial cable, (b) twisted-pair cable, (c) PCB microstrip, (d) differential PCB
microstrip, (e) PCB stripline, (f) differential PCB stripline. The items in the left
column are single-ended transmission lines; the right column shows differential
solutions.
When the wavelength range becomes comparable to or smaller than the
segment length, yet still much larger than its transverse dimensions, wave
propagation effects become significant. In these circumstances, the electrical
interconnection is referred to as a transmission line, modeled by the telegrapher’s
equations [Heaviside, 1887]. To avoid unwanted signal reflections, the
characteristic impedance (the ratio of the voltage across the conductors to the
current through them as the signal propagates) must be kept constant across the
entire segment length. This necessitates a careful geometrical layout of the
conductors in an environment with calibrated dielectrics (figure 1.3 shows
some examples). On the PCB level, segments spanning over 10 cm need to be
treated as transmission lines starting from about 30 MHz. Major PCB-level
implementations are microstrip [Grieg and Engelmann, 1952] and stripline
[Barrett and Barnes, 1951], where a PCB track is suspended above a ground
plane or sandwiched between two ground planes, respectively. For inter-PCB
interconnections from the MHz range up and for longer links, electrical
transmission line cables are used with a twisted-pair [Bell, 1881] or coaxial
[Espenschied and Affel, 1931] geometry.
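The two numerical examples in this section (a few millimeters at 1 GHz, 10 cm at about 30 MHz) both follow from the rule of thumb that the wavelength should be about 100 times the segment length. The sketch below assumes propagation at the vacuum speed of light, which reproduces the quoted figures; in a real dielectric the velocity is lower and the limits tighten accordingly. The function name and default arguments are illustrative.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s

def quasi_stationary_limit_hz(segment_length_m, ratio=100.0,
                              velocity_m_s=C_VACUUM):
    """Highest frequency whose wavelength is still `ratio` times the segment."""
    # wavelength = velocity / frequency >= ratio * length
    return velocity_m_s / (ratio * segment_length_m)

for length_m in (0.003, 0.10):
    limit = quasi_stationary_limit_hz(length_m)
    print(f"{length_m * 1e3:6.1f} mm segment: wire treatment up to {limit / 1e6:.0f} MHz")
```

With these assumptions a 3 mm segment can be treated as a plain wire up to roughly 1 GHz, and a 10 cm segment up to roughly 30 MHz.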
1.3.2.2 Waveguides
Communication conduits guiding EM radiation with wavelengths comparable
to or smaller than their transverse dimensions are called waveguides. Waveg-
uides are useful for communication in the microwave and optical spectrum
ranges: other, longer wavelengths would require at least decimeter-range
waveguide widths, which is impractically large on the scale of contemporary
electronic systems.
Waveguides have a non-magnetic dielectric (typically air, plastic, glass) or
semiconductive interior, which is transparent to the range of wavelengths that
should be propagated. This interior medium is surrounded by a conductive
material, a different dielectric material, or even a periodic structure, which
functions as a mirror for certain manifestations of EM radiation (called modes),
effectively confining their propagation to the path laid out by the waveguide.
Digital communication through the waveguide is then possible through the
modulation of one or more modes.
Although two-conductor electrical transmission lines behave as a multimode
waveguide as well at very short wavelengths, we strictly refer to the above
definition when the term ‘waveguide’ is used.
Figure 1.4: Rendering of a typical hollow metal waveguide.
Figure 1.5: Cross-section of an 8-wide fiber ribbon with clearly visible core and
cladding (figure labels: core, cladding, protection; indicated dimension: 250 µm;
photo by Nexans).
Waveguides used at microwave wavelengths are typically implemented as
hollow metal rectangular or cylindrical pipes (see figure 1.4) [Lord Rayleigh,
1897]. They can be used at the PCB or rack level for selected very high
bandwidth connections. However, the signal attenuation is sizeable due to
resistive losses in the metallic mirrors. The millimeter-range wavelengths
require corresponding transverse waveguide dimensions, precluding very
dense interconnect or a PCB-integrated approach. The (semi-)rigidity of
the metallic components also makes for a complicated physical waveguide
handling process.
1.3.2.3 Optical interconnect
Waveguides suitable for infrared or shorter wavelengths are called optical
waveguides. Confinement of the EM radiation to the waveguide interior is
classically implemented using the principle of total internal reflection (TIR). To
this end, in a step-index waveguide, the interior waveguide medium (the core) is
surrounded by another optical medium of lower optical density (the cladding;
see figure 1.5). In a graded-index optical waveguide, this high-to-low transition
of the optical density—required for TIR—takes place gradually inside the core
(as illustrated in figure 1.6). TIR is not the only possible radiation confinement
method: photonic crystals (optical wavelength-scale periodic microstructures
inhibiting the propagation and the absorption of certain wavelengths) can
also be used for this purpose [Yablonovitch, 1987; John, 1987].
Figure 1.6: Meridional ray trajectories in (a) a step-index fiber and (b) a graded-index
fiber. In the graded-index fiber, the lower optical density at the outside of the core
results in the ray traveling faster (indicated in red) than at the center of the core.
This effect equalizes the time of flight across all possible ray trajectories.
Optical waveguides are mass produced in the form of glass or plastic cylindri-
cal optical fibers [Keck et al., 1973; Miller et al., 1973]. In experimental PCB- or
IC-integrated approaches, they are realized with a rectangular or trapezoidal
cross-section. More exotic topologies appear when photonic crystals are used.
Signal propagation in an optical waveguide is typically driven by modulated
near-monochromatic light from a laser or light-emitting diode (LED) being
coupled into the waveguide. The use of a single wavelength prevents the
spreading out of the signal as it propagates due to wavelength-dependent
propagation speeds (chromatic dispersion). This effect would limit the attainable
bandwidth-distance product. Several physical channels—each emitting at a
different wavelength—may be multiplexed over the same optical interconnect
segment to increase the total throughput (this is called wavelength division
multiplexing or WDM) [DeLange, 1970].
When the transverse dimensions of the optical waveguide interior are suf-
ficiently larger than the wavelength, several different EM wave geometries
can satisfy the conditions for propagation imposed by the waveguide—the
modes we mentioned earlier. In such multimode waveguides, different modes
propagate at different speeds. The coupling of light into a waveguide gen-
erally excites several modes, and EM radiation can switch between modes
during propagation as well—an effect called mode mixing—due to material
impurities and geometrical irregularities (e.g., waveguide bends). Again, the
resulting spreading out of the modulated signal as it propagates at varying
speeds—multimode dispersion—limits the attainable bandwidth-distance prod-
uct. This is mitigated by the use of single-mode waveguides, with sufficiently
small transverse dimensions to guide only one mode. Alternatively, inside
graded-index waveguides the transverse optical density arrangement maximally
equalizes the propagation speeds of different modes.
1.4 Electrical versus optical interconnect
1.4.1 Interconnect demands and typical realizations
In this section, the motivation for the common use of optical links in telecom-
munication systems and electrical links inside computational systems is pre-
sented. The length of the interconnections is the decisive difference. For
telecommunication, the locations of the communication endpoints are non-
negotiable primary design inputs. Conversely, inside computational systems,
precisely the physical location of different elements is optimized towards inter-
connect efficiency. We distinguish interconnect requirements and realizations
starting from this perspective in the following discussion.
1.4.1.1 Telecommunication
Requirements The required bandwidth of a (logical) telecommunication
connection strongly depends on the application: telephone speech takes
up 64 kbps compared to about 20 Mbps for high-definition video. Physical
interconnection bandwidth needs are highest (up to the Tbps range) at
long-distance physical connections in network backbones, as very many logical
connections are multiplexed over them.
Telecommunication applications have to be robust with respect to an in-
escapable latency. The combination of the distance between the communi-
cation endpoints and the speed of light establishes a fundamental minimal
time-of-flight latency between a microsecond and several milliseconds, often
many times the signaling period. In the presence of a sizeable effective time-
of-flight latency across a logical connection, comparatively small additional
delays in intermediate amplification and routing provisions can be tolerated
as well.
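The quoted latency range can be related to distances with a one-line calculation. The sketch below assumes propagation at the vacuum speed of light; the `velocity_factor` parameter is an illustrative addition for media in which signals travel more slowly, such as fiber or cable.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s

def time_of_flight_s(distance_m, velocity_factor=1.0):
    """Minimal one-way latency over distance_m at velocity_factor * c."""
    return distance_m / (velocity_factor * C_VACUUM)

for distance_m in (300.0, 1_000_000.0):
    print(f"{distance_m / 1e3:8.1f} km -> {time_of_flight_s(distance_m) * 1e6:10.2f} µs")
```

Under this assumption a microsecond corresponds to about 300 m and a few milliseconds to about 1000 km.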
Realization The eventually realized bandwidth of long-distance segments
depends on cost figures rather than on physical limitations: if free-space (radio,
microwave, optical) links—which have no medium cost—are insufficient,
guided-wave links can be deployed in as many cables in parallel as deemed
necessary.
Practical guided-wave solutions are electrical transmission lines and optical
waveguides. For new multi-km interconnections, optical fibers have
superseded electrical solutions since the 1980s due to the associated vast
power and bandwidth advantages, allowing for long distances between
subsequent signal amplifiers. As for signal power losses, the attenuation of optical
fiber is 0.7 dB/km for contemporary multimode fiber and 0.2–0.3 dB/km for
single-mode fiber [Corning, 2007]. This attenuation is essentially independent
of the bandwidth of the modulation of light (yet it is wavelength dependent).
By contrast, low-loss coaxial cable already exhibits losses around 10 dB/km at
1 MHz, rising to 140 dB/km at 1 GHz, due to the imperfection of even the best
conductors and dielectrics [Belden, 2006].
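To make the attenuation figures above more tangible, the following sketch converts a loss in dB/km into the fraction of signal power that survives a given length. It uses only the definition of the decibel; the function name and the 1 km example length are illustrative choices.

```python
def received_power_fraction(loss_db_per_km, length_km):
    """Fraction of the launched power left after length_km of medium."""
    total_loss_db = loss_db_per_km * length_km
    return 10.0 ** (-total_loss_db / 10.0)

examples = {
    "single-mode fiber (0.2 dB/km)": 0.2,
    "multimode fiber (0.7 dB/km)": 0.7,
    "coax at 1 MHz (10 dB/km)": 10.0,
    "coax at 1 GHz (140 dB/km)": 140.0,
}
for name, loss in examples.items():
    print(f"{name:32s} {received_power_fraction(loss, 1.0):.3g}")
```

After 1 km, single-mode fiber still delivers about 95% of the launched power, whereas the coaxial cable at 1 GHz delivers a fraction of 10^-14.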
The maximal bandwidth of a single physical channel in an optical fiber is
dispersion limited. Numerical bandwidth-distance product values are around
5–50 MHz·km for step-index multimode fiber, 0.5–5 GHz·km for graded-index
multimode fiber, and 100–1000 GHz·km for single-mode fiber.
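A bandwidth-distance product translates into a usable bandwidth once a link length is fixed. The sketch below divides the ranges quoted above by the length; it assumes the product stays constant with length, which is a first-order simplification. The dictionary and function names are illustrative.

```python
# Bandwidth-distance product ranges from the text, in Hz*km.
BDP_HZ_KM = {
    "step-index multimode": (5e6, 50e6),
    "graded-index multimode": (0.5e9, 5e9),
    "single-mode": (100e9, 1000e9),
}

def bandwidth_range_hz(fiber_type, length_km):
    """Dispersion-limited bandwidth range for a link of length_km."""
    low, high = BDP_HZ_KM[fiber_type]
    return low / length_km, high / length_km

for fiber_type in BDP_HZ_KM:
    low, high = bandwidth_range_hz(fiber_type, 0.1)  # 100 m link
    print(f"{fiber_type:24s} {low / 1e9:8.2f} to {high / 1e9:8.1f} GHz")
```

For a 100 m link this gives roughly 5 to 50 GHz for graded-index multimode fiber, which indicates why multimode fiber remains attractive at short distances.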
Over several decades, the bandwidth demands of telecommunication
applications between a single pair of endpoints have increased vastly; present
bandwidth-hungry applications are broadband Internet access and video on
demand. The introduction of ever smaller and cheaper semiconductor lasers
and LEDs has made optical interconnections possible at increasingly shorter
distances. For instance, in an ordinary present-day office environment,
optical links for (multi-)gigabit Ethernet are widespread, spanning distances
as short as a few meters.
1.4.1.2 Computational systems
Requirements When an algorithm is implemented in hardware, the time it
takes just to carry out the necessary calculations and memory lookups by the
hardware provisions establishes a lower bound on its duration. This duration
lengthens whenever a step in the critical path is delayed because the
hardware component involved sits idle while its input data, although available,
is not yet at the right location. Obviously, if the speedy completion of the
algorithm is a goal, significant interconnect delays in the critical path should
be minimized.
Both the bandwidth and the latency requirements of interconnections can be
high here. The available bandwidth of an interconnection between hardware
components is too low when it becomes the limiting factor for the desired
throughput between these components. An example of interconnect stressed in
such a way is the link between the processor and the video card in a typical personal
computer during graphics-intensive applications (e.g., games), which has
necessitated very high-bandwidth interconnect solutions with a throughput
of up to 64 Gbps at present [PCI Express, 2004].
The interconnect latency can become a bottleneck as well wherever feedback
occurs in the information flow—a hardware component requiring data from a
distant second component, in its turn dependent on earlier data from the first.
This bottleneck becomes evident when the remote processing time is short
compared to the time it takes for information to travel there and back.
Whereas bandwidth and latency figures are mostly disconnected in
telecommunication interconnections, both become coupled for computational-system
links where the latency is the most critical. The relation between the actual
latency L, the time of flight T (including the intrinsic latency of signal
regenerators), the realized bandwidth B and the amount of information H to be
transmitted in one bundle is as follows:
$$L = T + \frac{H}{B} \qquad (1.1)$$
In this equation, either term can dominate. When T dominates, the link should
be made shorter to reduce the time of flight. Otherwise, one can increase B
beyond the average throughput to reduce the latency.
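The two regimes of equation 1.1 can be illustrated numerically. In the sketch below, the bundle size of 512 bits, the bandwidth of 4 Gbps and the two time-of-flight values are hypothetical numbers chosen only to show one case in which H/B dominates and one in which T dominates.

```python
def link_latency_s(time_of_flight_s, bundle_bits, bandwidth_bps):
    """Equation 1.1: L = T + H / B."""
    return time_of_flight_s + bundle_bits / bandwidth_bps

# Hypothetical short link: T = 1 ns, so the serialization term H/B dominates.
short_link = link_latency_s(1e-9, 512, 4e9)
# Hypothetical long link: T = 5 µs, so the time of flight dominates.
long_link = link_latency_s(5e-6, 512, 4e9)

print(f"short link: {short_link * 1e9:.0f} ns, long link: {long_link * 1e9:.0f} ns")
```

In the first case, doubling B nearly halves the latency; in the second case it changes almost nothing, and only a shorter link helps.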
A notable example where H/B is dominant comes from the world of high-
performance computing (HPC) and is evident as well in fast server hardware.
Symmetric multiprocessing (SMP) systems contain small but fast cache mem-
ories on each processor die to speed up memory requests in the presence of a
high locality of reference. This type of latency optimization is not primarily
interconnect-related, but results from the comparatively longer lookup times
in large dynamic memories. When one processor now reads from the same
memory location that another processor just wrote to, data needs to travel from
the other processor’s cache to the processor performing the read operation.
The intrinsic cache lookup time (presently about 5–8 ns for level 2 caches) is
significantly exceeded by the interconnection overhead (about 130–260 ns for
recent Intel Xeon and AMD Opteron SMPs [AnandTech, 2006]), likely holding
up execution at the processor at the reading end. We remark that this latency
is indeed governed by bandwidth limitations rather than the time of flight,
in the way that we have just explained: 130 ns per se would suffice for an
information roundtrip over more than 10 m, whereas the different processors are
orders of magnitude closer together.
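The roundtrip claim can be verified with a short calculation. The sketch assumes a signal velocity expressed as a fraction of the vacuum speed of light; the value 0.6 is an illustrative assumption for an electrical medium, not a figure from the text.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s

def roundtrip_reach_m(duration_s, velocity_factor=1.0):
    """One-way distance that can be covered there and back within duration_s."""
    return velocity_factor * C_VACUUM * duration_s / 2.0

print(f"vacuum:        {roundtrip_reach_m(130e-9):.1f} m")
print(f"0.6 c assumed: {roundtrip_reach_m(130e-9, 0.6):.1f} m")
```

Both cases give a reach above 10 m (about 19 m and about 12 m), so the time of flight between processors a few centimeters apart cannot account for the observed overhead.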
We want to emphasize that this bandwidth/latency example is not limited to
SMP systems in a strict sense. Given enough processors, limitations of shared
memory bandwidth enforce the use of processor-local memory approaches,
where interprocessor communication is taken over by an interconnection
network in which the low-latency links no longer constitute a fully connected
topology [Gupta et al., 1990].
The nature of some computational problems enforces algorithm implementa-
tions with a higher degree of interconnectivity between processor nodes than
can be provided by low-latency links of the underlying physical interconnec-
tion network. In this case, latency-induced performance limitations inevitably
arise as data is forced to travel through higher latency paths (e.g., through
multiple network hops) [Collet et al., 2000].
Realization As it is the native interface of present logic and memory
primitives, interconnect inside computational systems is normally electrical. In a
quick enumeration, this comprises on-chip metal tracks; chip packages with
bond wires, embedded metal connections and leads, pads or solder balls;
intra-PCB short or slower general tracks and longer/faster (single or
differential) microstrips and striplines; and inter-PCB slow flat cables and faster
twisted-pair and coaxial cables.
Figure 1.7: Dense electrical routing on the top layer of a common PC motherboard.
In contrast, optical interconnect always requires rather complex optical-
electrical (O-E) conversion provisions (light sources/modulators and pho-
todetectors) at the optical path endpoints, leading to a result with more
components and additional manufacturing steps. The signal conversion also
adds to the intrinsic latency T, thus rendering optical links unrewarding when
T dominates in equation 1.1 [Neefs, 2000a].
The cost arguments of long telecommunication segments mentioned earlier are
not valid here either. With specific regard to medium and repeater costs,
seemingly any reasonable bandwidth can be realized with many physically parallel
electrical lines without an overwhelming cost penalty, as the interconnection
distances are fairly short.
Figure 1.8: Illustration of parameters associated with the electrical bandwidth density
limit $B \sim B_0 A / l^2$ from Miller and Özaktaş [1997] (figure labels: ground plane,
signal wire, wire length = l, A, inter-wire spacing area).
However, computational systems are implemented as compactly as possible to
reduce the time of flight. The limited cross-sectional area available to each
interconnection thus restricts the attainable electrical interconnect parallelism.
1.4.2 Bandwidth limitations of closely packed electrical links
1.4.2.1 Fundamental bandwidth density limitations
The maximal bandwidth achievable over an electrical segment is limited by
presently inevitable conductor and dielectric material imperfections. Miller
and Özaktaş [1997] have derived a theoretical limit on the information
bandwidth attainable over a non-repeatered electrical interconnection of a given
length and available cross-sectional area, even when the physical parallelism
realized within this area is optimized. The limit is of the form $B \sim B_0 A / l^2$,
where B is the bandwidth, A the cross-sectional area, l the interconnection
length and $B_0$ a factor in the range $10^6$–$10^9$ Gbps (dependent on the feasibility
of certain materials, topologies and features at different scales). The physical
density of different interconnections thus limits the attainable throughput.
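The scaling behavior of this limit becomes clear with a numerical example. The sketch below evaluates B ~ B0·A/l² for an assumed cross-sectional area of 1 mm² and an assumed length of 10 cm, at both ends of the quoted B0 range. These geometry values are illustrative only, and the result is an order-of-magnitude bound rather than a prediction.

```python
def electrical_limit_gbps(b0_gbps, area_m2, length_m):
    """Order-of-magnitude bandwidth limit B ~ B0 * A / l^2, in Gbps."""
    return b0_gbps * area_m2 / length_m ** 2

AREA_M2 = 1e-6    # assumed cross-section: 1 mm^2
LENGTH_M = 0.10   # assumed interconnection length: 10 cm

for b0 in (1e6, 1e9):
    limit = electrical_limit_gbps(b0, AREA_M2, LENGTH_M)
    print(f"B0 = {b0:.0e} Gbps -> limit about {limit:.0e} Gbps")
```

Because the limit depends on A/l², doubling the length at constant cross-section divides the attainable bandwidth by four.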
This (presently still large) theoretical limit is further reduced by the dissipa-
tion in real-world dielectrics [Berger et al., 2003] and the extra spacing or
shielding required to limit crosstalk caused by electromagnetic interference
(EMI) between adjoining segments [Hall et al., 2000; Tummala, 2001]. Practical
considerations eventually determine the realized bandwidth density, e.g., the
realization of interconnections within layered structures like ICs and PCBs
with an upper feasibility limit on the number of interconnection layers.
1.4.2.2 Rising bandwidth density needs
The processing power of single ICs, the core of digital systems, continues to
rise rapidly. The 1965 empirical prediction called Moore’s law [Moore, 1965],
stating that the most cost-efficient number of transistors per IC doubles every
2 years, has been upheld until now, and is predicted to
hold for the foreseeable future by the International Technology Roadmap for
Semiconductors [ITRS Overview, 2006]. With the current 65 nm complementary
metal-oxide-semiconductor (CMOS) fabrication technology, the number
of transistors on a single die reaches several hundred million.
The boost of the number of transistors per chip is driven by ever-smaller
transistor fabrication technologies and to a lesser extent by an increase of the
chip size, which has doubled ‘only’ every six years [ITRS Overview, 2006]. In
addition, as transistors become smaller they are able to switch faster. Hence
not only the absolute amount, but also the density of IC processing power rises
quickly.
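The density argument can be made quantitative from the two doubling periods quoted above. The sketch assumes plain exponential growth for both the transistor count and the chip area, and it ignores the additional gain from faster switching; the function name is illustrative.

```python
def density_doubling_period_years(count_doubling_years, area_doubling_years):
    """Doubling period of transistors per unit area, given both growth rates."""
    # Density = count / area, so the growth rates (doublings per year) subtract.
    doublings_per_year = 1.0 / count_doubling_years - 1.0 / area_doubling_years
    return 1.0 / doublings_per_year

print(density_doubling_period_years(2.0, 6.0))
```

With a transistor count doubling every 2 years and a chip area doubling every 6 years, the transistor density doubles about every 3 years under these assumptions.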
Some applications drive interconnect demands to extremes when attempting
to fully exploit this massive processing potential. In these cases, established
electrical interconnection approaches fail to supply the bandwidth density
required to take significant advantage of further CMOS technology improve-
ments. The multiprocessing discussion above gives an example of this problem
arising in a computational system. Telecommunication systems can be af-
fected as well: in routing hardware, many high-bandwidth links are brought
together in a confined space. In routers at the Internet backbone (e.g., Juniper
Networks TX Matrix Platform [2004]), the huge bandwidth densities render
an all-electrical interconnect approach problematic.
1.4.2.3 Extending the reach of established approaches
At multiple levels in the physical interconnect hierarchy, diverse technological
or methodological techniques are deployed to extend the bandwidth density
of established electrical interconnection approaches. We will briefly cover such
techniques now.
On-chip On-chip electrical interconnect technology is making its best
effort to keep pace with transistor improvements, compelling the use of up
to 11 interconnect layers for 65 nm technology, better-conducting copper
instead of aluminum, and low-κ dielectrics that reduce parasitic capacitance
[ITRS Interconnect, 2006]. Signaling is typically synchronous with the on-chip
clock, with frequencies reaching several GHz over interconnects ranging
from single wires to on-chip buses hundreds of bits wide [Benner et al., 2005].
Intra-package Compared to general printed circuit board (PCB) facilities,
higher inter-IC bandwidth density as well as lower latency figures can be
achieved by combining the IC dice in the same package. The smaller size
of such Systems-in-a-Package (SiPs) enables higher-resolution manufacturing
techniques and eases the exploitation of the vertical dimension, both resulting
in a far higher link density than general PCBs. The shorter segment lengths
increase the attainable bandwidth density as well [Miller and Özaktaş, 1997].
SiPs strongly interconnect very large dice, dice of different fabrication tech-
nologies and passive components, in a horizontal, vertical or embedded
fashion [ITRS Packaging, 2006]. Realizations driven by bandwidth density
improvement comprise multi-core processor realizations and processor-cache
combinations. For example, the combined packaging of two dual-core dies
in recent Intel Xeon quad-core packages has reduced cache lookup latency
by 30% for different-die same-package lookups when compared to
different-package lookups [AnandTech, 2006].
On a PCB or backplane-connected system
Transmission lines At the PCB level, the processing accuracy is being im-
proved along with material and maximal layer count enhancements. The
high processing accuracy enables the realization of high-grade differential
striplines, which can be driven at multi-GHz frequencies using low-voltage
differential signaling (LVDS) [2001] and related standards.
Ordinary tracks used as quasi-stationary wires are bandwidth limited by their
length. To improve the bandwidth density, there is a present trend to replace
such tracks by true transmission lines. This changeover alleviates throughput
limitations of pin-limited IC packages as well.
The realization of multipoint-driven buses involving more than two parties
over true transmission lines introduces hampering signal timing and line
termination complications. For this reason, a transition to point-to-point star
topologies is observed, as witnessed by the changeover in present PCs from the
physically multipoint-driven PCI bus to the star topology of PCI Express. Here, a
multiple of the highest cost-effective bandwidth achievable over one differential
stripline is realized using parallel lines: in the present standard, a total of 32
parallel lines at 2.5 Gbps are supported, yielding a maximal point-to-point
bandwidth of 64 Gbps (taking into account a 25% data coding overhead)
[PCI Express, 2004]. Other high-bandwidth low-latency standards like
HyperTransport [2001], Parallel RapidIO [1999] and InfiniBand [2000] are of a
similar nature: all of them employ physical point-to-point links over parallel
transmission lines with comparable low-level signaling approaches.
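The 64 Gbps figure for PCI Express follows from the lane count, the line rate and the coding overhead. The sketch below assumes that the 25% overhead is expressed relative to the payload, that is, 10 line bits are transmitted for every 8 data bits, as in an 8b/10b code. The function name is illustrative.

```python
def payload_bandwidth_gbps(lanes, line_rate_gbps, coding_overhead):
    """Usable bandwidth when coding_overhead is expressed relative to payload."""
    raw_gbps = lanes * line_rate_gbps
    return raw_gbps / (1.0 + coding_overhead)

print(payload_bandwidth_gbps(32, 2.5, 0.25))
```

The raw rate of 32 × 2.5 Gbps = 80 Gbps thus reduces to 64 Gbps of payload.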
In the rest of this text, physically multipoint-driven buses are not considered
further. To avoid confusion, we should mention that multidrop (single-driver)
signal distribution with a branched transmission line path remains
possible and has its applications (e.g., gigabit multidrop serial backplanes
[Esper-Chain et al., 2005]). Apart from an increase of transmitted power and
specific implementation aspects of the line branching, multidrop and
point-to-point signaling mechanisms are very similar (and this holds in the optical
domain as well). Implementation aspects of branches are beyond the scope of
this thesis.
Equalization The maximal information bandwidth over a single differential
stripline has increased as well through improved conditioning of the analog
signal. Higher signal frequencies are selectively amplified to precisely counteract
the frequency-dependent signal attenuation throughout the stripline caused by
the skin effect and dissipative dielectrics. The net result is an attenuated, but
not distorted, signal at the receiver. This signal conditioning is provided at the
segment endpoints by dedicated preemphasis or equalization circuits. Although
not new (it was developed in the 1920s and 1930s for long electrical telecommunication
links [Zobel, 1926; Mayer, 1936; Bode, 1937]), this technique has only relatively
recently been applied to short digital links [Dally and Poulton, 1997]. In a
present state-of-the-art commercial example, equalization permits 10.7 Gbps
transmission over a single 75 cm skin-effect-limited stripline using copper and
standard FR-4 PCB dielectrics [Maxim MAX3805, 2006]. This corresponds to a
path loss of 20 dB at 5 GHz: such an extreme line bandwidth exploitation is
consequently coupled with severe power losses.
Figure 1.9: Ball grid array (BGA) package with a cavity suitable for wire bonding.
Chip access/escape perimeter PCB-level bandwidth density problems are
the most critical very close to ICs with demanding throughput requirements.
Strictly speaking, the bandwidth density of a multi-centimeter interconnection
can always be increased by adding (at a cost) extra signal regenerators
in between, thus decreasing segment lengths. Close to a demanding IC,
however, some thousand lines converge. The line count and regenerator
size effectively limit the proximity of any signal regenerators to the IC die.
The final millimeters of the interconnection—thus unsupported by active
components—bring on a complex escape perimeter problem stressing PCB
manufacturing, chip packaging as well as assembly methods.
On a PCB, striplines are normally nicely interspaced and spread over multiple
PCB layers. When very many lines are approaching a common IC endpoint,
they all have to emerge and share the PCB surface underneath the IC package,
yet with pads large enough for a reliable package mounting. Manufacturing
techniques for blind/buried vias and microvias allow lines to come to the
surface without perforating the whole PCB or taking up too much area. The
package itself is in charge of the remainder of the downconversion, from the
PCB pad pattern to the IC-scale input/output (I/O) link density.
For bandwidth-critical applications, a transition has been made over the last
two decades from electrical I/O at the perimeter to uniformly spread
surface-based I/O, concerning both the package–PCB interface and the IC–package
interface. Edge-leaded packages are being superseded by pin or ball grid
array (BGA) packages (figure 1.9); flip-chip bonding is chosen over wire-bonded
ICs.
Using present manufacturing technologies, BGA packages are feasible with
up to about 4000 balls on a 0.65 mm pitch, signaling at 5 Gbps per
differential pair [ITRS Packaging, 2006]. Further density and size improvements
bring on issues resulting from line impedance changes, crosstalk caused by
electromagnetic interference, current density limitations in the power
distribution, and mechanical strain caused by different thermal expansion coefficients
[Blackshear et al., 2005].
Backplane and cable connectors In rack-based systems, a critical point is
the bandwidth density of card edge connectors, as it is hard to maintain a
high bandwidth density over a complex interface change. High-performance
card edge connectors for backplane- or cable-based interconnect increasingly
abandon single-ended designs in favor of a pure differential approach: in
this way, transmission lines can be brought off-board without jumps of the
characteristic impedance. Present cutting-edge connectors (e.g., FCI Airmax
VS [2003]) support bandwidth densities up to 20 Gbps per millimeter of card
edge for a card-to-card pitch of 25 mm.
1.4.3 Optical interconnect: a superior bandwidth density alternative
1.4.3.1 Justifiable deployment at ever shrinking distances
Short-range optical interconnect has been repeatedly proposed over the last
two decades, as it provides a superior alternative to electrical interconnect
in situations with huge bandwidth density demands [Goodman et al., 1984;
Miller, 2000]. An optical transmission path possesses several desirable
properties, some of which we have already mentioned in section 1.4.1.1. Attenuation
is low (in some cases a fractional dB/km figure is very realistic) and by far
steadier with respect to the EM wave frequency than for electrical links: it
conceivably remains within the same order of magnitude across several tens of
THz [Corning, 2007]. Considering the presently used GHz-range modulation
of near-monochromatic light, the attenuation is essentially independent of the
modulation bandwidth. Even higher modulation rates will therefore not
necessitate improved physical optical path characteristics and the ensuing hardware
renewal costs (for instance, optical backplanes can be reused). Combined
with the small cross-section (a typical fiber diameter is 125 µm, determined
by the ease of manipulation; in integrated approaches few-µm waveguide
pitches are feasible as well), very high bandwidth densities can be achieved.
Furthermore, the absence of conductors avoids the interaction of free electrons
with the EM field and all associated EMI problems, removing the need for
spacing or shielding in between parallel segments. Figure 1.10 illustrates the
small pitch attainable this way.
Figure 1.10: High-density optical connector providing support for 64 fibers in the
indicated 4 mm² area, placed on a € 1 coin for comparison.
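A rough comparison of areal bandwidth densities can be made from figures already given in this chapter: 20 Gbps per millimeter of card edge at a 25 mm card pitch for an electrical connector, and 64 fibers in 4 mm² for the optical connector of figure 1.10. The sketch below assumes that the electrical density may be spread over the card pitch to obtain a per-area figure, and it uses a hypothetical rate of 2.5 Gbps per fiber; neither assumption comes from the text.

```python
def electrical_density_gbps_per_mm2(gbps_per_mm_edge, card_pitch_mm):
    """Areal density when the edge density is spread over the card pitch."""
    return gbps_per_mm_edge / card_pitch_mm

def optical_density_gbps_per_mm2(fiber_count, area_mm2, gbps_per_fiber):
    """Areal density of a fiber array; gbps_per_fiber is an assumed rate."""
    return fiber_count / area_mm2 * gbps_per_fiber

electrical = electrical_density_gbps_per_mm2(20.0, 25.0)
optical = optical_density_gbps_per_mm2(64, 4.0, 2.5)  # 2.5 Gbps: hypothetical
print(f"electrical {electrical:.1f} Gbps/mm², optical {optical:.1f} Gbps/mm²")
```

Under these assumptions the optical connector offers 40 Gbps/mm² against 0.8 Gbps/mm² for the electrical one, a factor of 50. The exact ratio scales linearly with the assumed per-fiber rate.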
Although the bandwidth density of electrical interconnect could be increased
through the spending of even more power, there are situations in which the
associated costs and heat removal issues no longer compensate for the added
complexity of optical link integration. As the previous sections have made
Figure 1.11: Artist’s impression of a possible OE-VLSI approach. Labeled parts:
fiber bundle, connector, package, CMOS substrate (topside visible), emitters
and detectors, solder balls, PCB.
clear, for a given cross-section and a constant power budget the achievable
bandwidth strongly decreases with distance in an electrical approach, whereas
it remains essentially constant (on a system scale) when using an optical
transmission path. This gives rise to a break-even distance beyond which
optical bandwidth density outperforms electrical bandwidth density for a
similar power budget, given the state of the art in both approaches.
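
The break-even argument can be sketched with a toy model. It assumes, purely for illustration, that the electrical bandwidth density at a fixed power budget follows a power law in the distance, while the optical bandwidth density stays constant. The inverse-square exponent, the function names and all numeric values are hypothetical and are not taken from this text.

```python
def electrical_density(distance_cm, density_at_ref, ref_cm=1.0, exponent=2.0):
    """ASSUMED power-law decay of electrical bandwidth density with distance."""
    return density_at_ref * (ref_cm / distance_cm) ** exponent


def break_even_distance(density_at_ref, optical_density, ref_cm=1.0, exponent=2.0):
    """Distance at which the assumed electrical curve meets the constant optical level."""
    return ref_cm * (density_at_ref / optical_density) ** (1.0 / exponent)


# Hypothetical numbers in arbitrary (but equal) units of bandwidth density.
d_star = break_even_distance(density_at_ref=400.0, optical_density=100.0)
print(f"break-even distance: {d_star} cm")
print(f"electrical density there: {electrical_density(d_star, 400.0)}")
```

In this sketch, cheaper or faster O-E components raise the optical level attainable for the same power budget, which moves the break-even distance towards shorter links.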
The introduction of ever faster, smaller and cheaper O-E components—such as
two-dimensional Vertical-Cavity Surface-Emitting Laser (VCSEL) arrays—has
enabled a very tight integration of electronics and optics, and has strongly
reduced this break-even distance to the centimeter range at present [Naeemi
et al., 2004; Cho et al., 2004a; Uhlig and Robertsson, 2006].
Fiber-based rack-to-rack and board-to-board solutions are now commercially
available, and a number of board-to-board and chip-to-chip free-space, fiber-
based or waveguide-based prototypes have been demonstrated [Neefs, 2000b;
Ishii et al., 2003; Bockstaele et al., 2004a; Cho et al., 2004b; Yoon et al., 2004;
Rho et al., 2004; Cho et al., 2005; Schares et al., 2006]. Even on-chip optical
interconnect is actively being investigated [PICMOS, 2006; Roelkens et al.,
2006].
1.4.3.2 OE-VLSI: tight optical interconnect integration
When bandwidth density issues justify the deployment of optical interconnect
between different ICs, the question remains how close to both link endpoints
the required O-E conversion should be implemented. In present commercial
systems of this kind (e.g., using SNAP12 [2002] modules), the O-E conversion
occurs in separate packages on the PCB, located at the PCB edge or next to
the bandwidth demanding processing IC(s). Part of the link—between the
processing IC(s) and the O-E module—is thus still electrical in nature, and
the bandwidth density issues concerning the PCB-to-IC access—described in
section 1.4.2.3—continue to hold.
By integrating the processing IC and O-E provisions into the same package,
this extra PCB link is avoided. The tightest form of integration—and the central
perspective in this thesis—is the direct provisioning of the O-E conversion
where the data is needed or generated: on the IC die. Such an approach avoids
the ever aggravating known bandwidth density issues [ITRS Packaging, 2006]
concerning PCB manufacturing, chip packaging and assembly all at once.
The on-chip optical access is particularly interesting when the O-E conversion
is area-based as opposed to linear: the data rate can then reach several
hundreds of gigabits per second while still using GHz-range channels. This
is an improvement similar to the transition from perimeter-based to surface-
based electrical connectivity of ICs and packages (see section 1.4.2.3, paragraph
chip access). Such a surface-normal optical interconnection approach has been
called optoelectronic very large scale integration (OE-VLSI) in [Krishnamoorthy
and Goossen, 1998]; this is the term we will use throughout this document.
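
The gain of area-based over perimeter-based access can be made concrete with a small counting sketch. It assumes a square grid of access sites; the grid sizes and per-channel rates are illustrative assumptions, not figures from this text.

```python
def area_channels(sites_per_side):
    """All sites of a square grid are usable (area-based access)."""
    return sites_per_side * sites_per_side


def perimeter_channels(sites_per_side):
    """Only the boundary sites of the same grid are usable (linear access)."""
    if sites_per_side < 2:
        return area_channels(sites_per_side)
    return 4 * sites_per_side - 4


def aggregate_gbps(channels, rate_gbps_per_channel):
    """Total data rate when every channel runs at the same rate."""
    return channels * rate_gbps_per_channel


# ASSUMPTION: per-channel rates of 2.5 Gb/s and 1.25 Gb/s are placeholders.
for n, rate in ((8, 2.5), (16, 1.25)):
    print(n, aggregate_gbps(area_channels(n), rate),
          aggregate_gbps(perimeter_channels(n), rate))
```

The area-based count grows quadratically with the number of sites per side, whereas the perimeter-based count grows only linearly.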
The practical approach towards an OE-VLSI system is the subject of the next
chapter. For the smoothness of the presentation we already mention some
basics of the system under review here. As will be discussed there, emitting
light from CMOS substrates is hard; it is much easier to use wafers of III-V
compound materials for this purpose (this name refers to the columns of
the periodic table of elements involved). The approach taken is one where
optical emitters and detectors are realized in separate III-V wafers, organized
in two-dimensional arrays. These arrays are then integrated with CMOS using
flip-chip bonding. The surface-normal optical emission is coupled into some
kind of optical pathway, which guides the light to one or more detectors. An
artist’s impression of such a system is rendered in figure 1.11.
1.5 This thesis
Although the direct integration of O-E conversion on an IC bypasses a very
complex electrical connection, the associated intricacies of this connection have
been well studied and characterized, and a number of established solution
methodologies are available. By contrast, all complexity associated with
surface-level O-E conversion—involving the heterogeneous integration of III-V
devices—and the integration of an optical path at the package and PCB level
has been neither explored nearly as well nor convincingly resolved, although a
number of successful experimental approaches are known (for this we refer to
the next chapter).
The complexity of just the physical assembly of a highly parallel OE-VLSI
manifestation is impressive, as a number of cutting-edge techniques are
required just to make things work.
This dissertation focuses on another kind of complexity—that which is pre-
sented to a system designer considering an OE-VLSI approach. The main
motivation of this work is to facilitate the integration of an OE-VLSI setup
in a digital system, assessing methodological and operational aspects where
possible.
At the creation of a new digital system, several questions need to be addressed.
To begin with, in order to even bring OE-VLSI into the picture, it is vital to
know what one can do with the parallel optical link provided. What about
the signal timing and integrity? How skewed are the signals of adjoining
links at the link exit, if they were synchronous at the link entrance? Should
the optically signaled data be dc-balanced, as it is in telecommunication
applications? What is the attainable bandwidth? How much power to spend,
and where does it go?
Answers to questions like these determine the attractiveness of the system,
and are key to the feasibility of—or need for—different approaches concerning
the encoding of the data and the signal timing approach taken. The answers
can only be supplied by a thorough (statistical) modeling and characterizing
approach, revealing many operational system aspects. This is the main contri-
bution of this work, presented in chapter 4. We show that source-synchronous
signaling over parallel optical interconnect with concurrent sampling at the
receiving end is possible in carefully designed systems, but that dc-unbalanced
signaling proves to be impracticable.
Electronic design automation (EDA) tool support is important as well for a
system designer, assisting in the design creation (instantiating and dimen-
sioning different elements of the optical interconnect in a digital system),
design evaluation/verification (simulation of the interconnect together with
the rest of the system) and even design space exploration (to attain an optimal
trade-off between different qualities and costs). This subject is touched upon
in chapter 3 with a methodological discussion and circuit-level simulation
models.
An important operational aspect is the possible interference between analog
OE-VLSI–related circuits on the CMOS—especially receiver circuits—and
directly adjoining digital circuits through the CMOS substrate. We have
been in the position to test a measurement and evaluation approach for such
disturbances; this effort is treated in chapter 5. We could confirm that a guard
ring protection of receiver circuits suffices to absorb local substrate disruptions,
thus limiting the problem of substrate noise coupling to global supply and
ground bounce, which has to be taken into account anyway (although it admittedly
occurs with a higher amplitude in an OE-VLSI setup).
The work for this thesis has been performed in the context of the Interconnect by Optics
(IO) project of the European Commission’s Fifth Framework Programme,
which ran from the end of 2001 to early 2005. The main project objective
was to research the industrialization of OE-VLSI systems. All experimental
characterization reported upon in this work concerns hardware prototypes
developed within the IO project. This often involves the principal test chip
of the IO project in 0.35 µm CMOS, going by the name of Digital Technology
Assessment (DTA) IC. The composition of this IC and other relevant hardware
is presented in appendix A.
1.6 Overview of achievements
This section gives an overview of the research work performed for this thesis.
All discussed research items exhibit a major personal contribution; related joint
efforts are acknowledged. Although the most prominent prior publications are
indicated here as well, we refer to the main text for a more detailed discussion.
1.6.1 Circuit-level OE-VLSI component modeling
Our earliest research work addresses the construction of simulation mod-
els for the different components constituting an OE-VLSI interconnect
implementation—O-E conversion hardware, interfacing circuits and the
optical path—and their implementation in the Verilog-AMS [1998] mixed-
signal hardware description language. The associated objective was to permit
an all-embracing OE-VLSI simulation integrated in an existing electronic
design automation (EDA) framework (in practice, Cadence Virtuoso
[1987]).
On the one hand, this could be used to predict the behavior of the OE-VLSI
approach after all components had been characterized. On the other hand this
would be a step towards the joint tuning of all OE-VLSI components for an
optimal power, performance and reliability trade-off from an across-the-board
perspective, thus transcending the optimization of OE-VLSI components in
isolation.
We have implemented operational simulation models and achieved a
thorough understanding of the intricacies concerning the integration of
multidisciplinary—electrical, optical, thermal—and sometimes badly condi-
tioned differential equations—e.g., laser rate equations—into a simulation
system bearing the marks of an electrical circuit dedication [De Wilde et al.,
2003b, 2004].
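
For orientation, a commonly used single-mode form of the laser rate equations is given below. It is a textbook form under simplifying assumptions and not necessarily the form adopted in chapter 3. Here N and S denote the carrier and photon densities, I the injected current, q the elementary charge, V the active volume, \tau_n and \tau_p the carrier and photon lifetimes, \Gamma the confinement factor, \beta the spontaneous emission coupling factor, g_0 a gain coefficient, N_{tr} the transparency carrier density and \varepsilon the gain compression factor.

```latex
\begin{align}
  \frac{dN}{dt} &= \frac{I}{qV} - \frac{N}{\tau_n}
                   - \frac{g_0\,(N - N_{tr})}{1 + \varepsilon S}\, S \\
  \frac{dS}{dt} &= \Gamma\,\frac{g_0\,(N - N_{tr})}{1 + \varepsilon S}\, S
                   - \frac{S}{\tau_p} + \Gamma\,\beta\,\frac{N}{\tau_n}
\end{align}
```

Since \tau_p is typically orders of magnitude shorter than \tau_n, the system is stiff, which is one reason why such equations can be badly conditioned in a simulator designed for electrical circuits.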
The characterization and tuning work was postponed pending a yet further
progression of the IO project regarding O-E components development and
measurability. Two Master’s theses [De Clerck, 2004; Bhatti, 2006] have tackled
these postponed subjects at a later time, building on the established modeling
research.
In a later project on intra-chip interconnect [PICMOS, 2006], for the modeling
and characterization of µm-sized distributed Bragg reflector (DBR) lasers—
research work of Ph.D. student Joris Van Campenhout—much of the acquired
knowledge could be recycled in a Verilog-AMS model implementation effort
which proved useful in a larger framework on the systematic simulation-based
predictive synthesis of integrated optical interconnect [O’Connor et al., 2007].
The circuit-level modeling research is treated in chapter 3.
Acknowledgements This research effort was performed in close cooperation
with fellow Ph.D. student Olivier Rits, with input from several IO project
partners on the OE-VLSI component of their specialty.
The scientific contribution of this initial research work is limited to method-
ological issues; it concerns model integration rather than the development of
the model equations themselves; references where due are given in chapter 3.
1.6.2 Statistical modeling and characterization of an OE-VLSI
system
We have thoroughly examined the connection between the many efficiency and
accuracy aspects of a sophisticated OE-VLSI assembly and resulting across-
the-board optical interconnect characteristics such as overall signal attenuation
and offset, latency, jitter, noise and ensuing bit error ratio (BER) figures.
A modeling and characterization approach of a strong statistical nature was
adopted. We have developed simple yet adequate stochastic models to cap-
ture the behavior and uniformity of the different subsystems of the optical
interconnect—driver circuit, VCSEL, optical path, photodiode and receiver
circuit. These models specifically address amplitude, timing and noise behav-
ior. The methodology is generally applicable; for the purpose of quantitative
analyses our models have been characterized based on detailed and extensive
measurements of the actual performance on OE-VLSI hardware from the IO
project (the converter boards presented in appendix A). In order to capture as
much information as possible, this measurement approach has been one of
dividing the parallel optical link in as many parts as possible—by cutting in
between driver and laser arrays, in front of and behind the fiber assembly, and
in between photodiode and receiver arrays.
Major attention is given to the statistical modeling of the butt coupling effi-
ciency between a VCSEL array and a multi-fiber connector. To this end, a 3-axis
positioning setup for the controllable alignment of an optical fiber
and an OE-VLSI module has been realized. This system was autonomously
able to accurately scan the alignment-dependent coupled power between all
devices of an 8×8 VCSEL (or photodiode) array and a fiber.
We have used the acquired measurements to characterize a simple ray tracing
model for the expected coupling efficiency on an 8×8 VCSEL array. A stochas-
tic model (taking process variations into account) for the alignment-conditional
power coupling between a VCSEL and a fiber has been characterized as well.
When a fiber bundle is connectorized, the positions of the fiber facets are
mutually fixed relative to each other as well as relative to the alignment
features of the connector. When the connector is plugged into a package,
the misalignment of laser-fiber pairs at different positions of an O-E array is
therefore strongly correlated. We have combined our alignment-conditional
VCSEL-fiber coupling characterization with a model of the array-wide align-
ment of a connectorized fiber bundle to an O-E array. The latter model was
characterized for OE-VLSI packages and connectors of the IO project (courtesy
of fellow PhD student Olivier Rits; see also [Rits et al., 2004]).
On the IO project hardware, it turns out that both the inter-VCSEL variability
and the fiber bundle alignment uncertainty are significant. Effects with array-
wide correlation are produced by significant global translational misalignment,
and the smaller impact of global rotational misalignment turns out to be
dwarfed by the VCSEL process variations. In conclusion, we have derived a
practicable stochastic model for the array-wide laser-fiber coupling and have
characterized its distributions.
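
The structure of such an array-wide model can be sketched as a small Monte Carlo experiment. This is a minimal sketch and not the model of chapter 4: the Gaussian dependence of the coupling efficiency on the lateral offset, the normal distributions, the 250 µm pitch and every numeric parameter are assumptions chosen for illustration only.

```python
import math
import random


def simulate_array_coupling(rng, n=8, pitch_um=250.0, sigma_shift_um=5.0,
                            sigma_rot_rad=1e-3, sigma_device=0.1,
                            peak_eff=0.8, tol_um=15.0):
    """Draw one connector insertion and return an n x n grid of coupling efficiencies."""
    # Array-wide (fully correlated) misalignment: one draw per insertion.
    dx = rng.gauss(0.0, sigma_shift_um)
    dy = rng.gauss(0.0, sigma_shift_um)
    theta = rng.gauss(0.0, sigma_rot_rad)
    center = (n - 1) / 2.0
    grid = []
    for row in range(n):
        line = []
        for col in range(n):
            x = (col - center) * pitch_um
            y = (row - center) * pitch_um
            # Local offset = global shift + small-angle rotation about the array center.
            ex = dx - theta * y
            ey = dy + theta * x
            # ASSUMED Gaussian roll-off of the coupling with lateral offset.
            align_eff = peak_eff * math.exp(-(ex * ex + ey * ey) / tol_um ** 2)
            # Independent per-device variation (process spread), clipped at zero.
            device = max(0.0, rng.gauss(1.0, sigma_device))
            line.append(align_eff * device)
        grid.append(line)
    return grid


rng = random.Random(1)
array = simulate_array_coupling(rng)
worst = min(min(row) for row in array)
best = max(max(row) for row in array)
print(f"coupling efficiency range: {worst:.3f} .. {best:.3f}")
```

Repeating the draw many times and collecting per-channel statistics separates the array-wide correlated contribution (shift and rotation) from the uncorrelated per-device contribution.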
By bringing all characterized component models together, we have achieved
the quantification of all-inclusive signal attenuation and offset, latency devia-
tion, jitter, noise and ensuing bit error ratio (BER) figures, not only including
the statistical evaluation of isolated optical channels, but also capturing the
statistical dependencies between different channels juxtaposed within the
same array or package [De Wilde et al., 2007].
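
The step from signal levels and noise to a BER figure can be illustrated with the standard Gaussian-noise approximation. The sketch below assumes Gaussian noise on both logic levels and an optimally placed decision threshold; the models of chapter 4 may differ, and the example levels are hypothetical.

```python
import math


def q_factor(mean_one, mean_zero, sigma_one, sigma_zero):
    """Q factor for two Gaussian-distributed logic levels."""
    return (mean_one - mean_zero) / (sigma_one + sigma_zero)


def ber_from_q(q):
    """Bit error ratio under the Gaussian approximation: 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


# Hypothetical normalized levels and noise standard deviations.
q = q_factor(mean_one=1.0, mean_zero=0.0, sigma_one=0.1, sigma_zero=0.07)
print(f"Q = {q:.2f}, BER = {ber_from_q(q):.2e}")
```

Attenuation and offset enter through the level means, noise through the standard deviations, so channel-to-channel spread in any of them translates directly into a spread of BER across the array.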
The results obtained have been directly put to work to evaluate the option
of using true source-synchronous signaling over optical interconnect with a
high physical parallelism, reducing the substantial per-channel clock synchro-
nization circuitry to one instance. We have also looked into dc-unbalanced
signaling to remove the need for data coding.
For the IO project hardware, the usage of a common logic threshold across
all channels, required for dc-unbalanced signaling, appears infeasible after all
models are combined. Efficient true source-synchronous signaling turns out
to be in reach in carefully designed systems.
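
Why a common logic threshold can become infeasible is easy to see in a simplified check: a single threshold works only if it lies above the highest '0' level and below the lowest '1' level found anywhere in the array. The function name, the margin parameter and the example levels are hypothetical.

```python
def common_threshold_window(zero_levels, one_levels, margin=0.0):
    """Return the (low, high) window for a shared threshold, or None if it is empty."""
    low = max(zero_levels) + margin
    high = min(one_levels) - margin
    return (low, high) if low < high else None


# Hypothetical received levels (arbitrary units) for three channels.
uniform = common_threshold_window([0.10, 0.12, 0.08], [0.50, 0.45, 0.60])
# Same array, but the second channel is strongly attenuated.
spread = common_threshold_window([0.10, 0.12, 0.08], [0.50, 0.11, 0.60])

print(uniform)
print(spread)
```

In this toy example a single weak channel is enough to close the window, which matches the intuition that array-wide uniformity is the limiting factor.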
The statistical modeling research is presented in chapter 4.
Acknowledgements The alignment-conditional measurements have been
performed in close cooperation with fellow PhD student Olivier Rits, whose
packaging research is kindly acknowledged [Rits et al., 2004]. He has con-
tributed in the ray tracing department as well. All remaining component
modeling work, the array-wide statistical power model and the global inter-
connect parallelism evaluation are personal efforts with support from my
advisor as to the correct application of statistics and a comprehensive notation
solution.
1.6.3 Substrate noise
In an OE-VLSI approach, optical receiver circuits are integrated in a chip
alongside one another and next to rapidly switching digital circuits. In this situation,
coupling of digital switching noise through the substrate can compromise the
accurate operation of the receiver circuits. Besides its relevance for OE-VLSI
systems, substrate noise is an important issue in all mixed-signal designs
where sensitive analog circuits are embedded in a hostile digital environment.
We have researched the propagation of localized CMOS substrate disturbances to
nearby positions. To this end, we have implemented customizable substrate
noise generation circuits and a noise measurement circuit next to sensitive
receiver circuits of the IO project in 0.18 µm CMOS (figure A.6 on page 177).
We have assessed whether locally injected substrate noise could influence the
operation of the adjoining receiver circuits.
We have achieved a large substrate noise measurement bandwidth, which
can be explained from the detailed circuit behavior [De Wilde et al., 2006a,b].
The accurate wave tracing of pulses as narrow as 200 ps has been demonstrated.
Furthermore, we have devised an approach to mitigate measurement
inaccuracies resulting from a noisy voltage reference or sampling time jitter.
When one tries to simulate noise coupling effects in detail, the computational
overhead of existing substrate noise simulation tools (e.g., Cadence Substrate
Noise Analyst [2004]) is tremendous for all but the smallest designs. We have
performed such simulations, necessarily simplified to a certain degree, and a
good agreement between measurement and simulation was obtained.
The receiver circuits turned out to be adequately shielded against localized
substrate disturbances, yet we were able to identify one remaining disruptive
path to the unshielded bond pads of a sensitive bias net. We could confirm that
a guard ring protection of receiver circuits suffices to absorb local substrate
disruptions, thus limiting the problem of substrate noise coupling to global
supply and ground bounce, which has to be taken into account anyway.
The substrate noise related research is treated in chapter 5.
Acknowledgements The circuit design and the overall placement decision
have been personal efforts; for the actual CMOS layout colleague Wim Meeus
should be recognized. The evaluation of the noise measurement circuit was a
joint effort with my advisor prof. Jan Van Campenhout.
Our measurement setup was inspired by Nagata et al. [2000]. The approach
is however different from theirs as localized substrate noise is observed here,
involving much higher noise frequencies than the more commonly studied
global substrate noise effects (ground bounce or ringing).
1.7 Structure of the thesis
The organization of the different chapters is as follows. In the next chapter,
the OE-VLSI approach adopted in the IO project is presented. In chapter 3,
design automation issues are addressed. Chapter 4 takes on the statistical
modeling of the components of the optical link, following the natural flow of
data through the interconnect with a respective discussion of the transmitter
side, the optical path and the receiver side. Building on these models, the
question on the feasibility of dc-unbalanced and true source-synchronous
signaling is addressed at the end of the chapter. Chapter 5 discusses our
substrate noise research. Finally, chapter 6 puts forward concluding remarks.
1.8 Publications
Journal papers
• De Wilde, M., Rits, O., Baets, R., and Van Campenhout, J. (2007). Synchronous
parallel optical I/O on CMOS: a case study of the uniformity issue. IEEE/OSA
Journal of Lightwave Technology. In review.
• O’Connor, I., Tissafi-Drissi, F., Gaffiot, F., Dambre, J., De Wilde, M., Van Camp-
enhout, J., Van Thourhout, D., Van Campenhout, J., and Stroobandt, D. (2007).
Systematic simulation-based predictive synthesis of integrated optical intercon-
nect. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. To be
published.
• De Wilde, M., Meeus, W., Rombouts, P., and Van Campenhout, J. (2006a).
A simple on-chip repetitive sampling setup for the quantification of sub-
strate noise. IEEE Journal of Solid-State Circuits, 41(5):1062–1072. Avail-
able from: http://escher.elis.ugent.be/publ/Edocs/DOC/P106_052.pdf,
doi:10.1109/JSSC.2006.872873.
Conference papers with international reviewers
• De Wilde, M., Meeus, W., and Van Campenhout, J. (2006b). Analysis of localized
high-frequency substrate noise. In Proceedings of the CDNLive! Silicon Valley
2006 Conference, volume PNP, San Jose, CA, USA. Available from: http://www.
cdnusers.org/Portals/0/cdnlive/na2006/PNP/PNP_257/257_paper.pdf.
• O’Connor, I., Tissafi-Drissi, F., Navarro, D., Mieyeville, F., Gaffiot, F., Dambre,
J., De Wilde, M., Stroobandt, D., and Briere, M. (2006a). Integrated optical
interconnect for on-chip data transport. In Proceedings of the 4th IEEE North-
East Workshop on Circuits and Systems (NEWCAS 2006), pages 209–212, Gatineau,
Canada. ISBN: 978-1-4244-0416-2. Available from: http://escher.elis.ugent.
be/publ/Edocs/DOC/P106_142.pdf, doi:10.1109/NEWCAS.2006.250937.
• Rits, O., De Wilde, M., Roelkens, G., Bockstaele, R., Annen, R., Bossard, M.,
Marion, F., and Baets, R. (2006). 2D parallel optical interconnects between CMOS
ICs. In Eldada, L. A. and Lee, E.-H., editors, Optoelectronic Integrated Circuits
VIII, volume 6124 of Proceedings of SPIE, pages 168–179. ISBN: 978-0-8194-6166-
7. Available from: http://escher.elis.ugent.be/publ/Edocs/DOC/P106_035.
pdf, doi:10.1117/12.652108.
• Bockstaele, R., De Wilde, M., Meeus, W., Sergeant, H., Rits, O., Van Campenhout,
J., De Baets, J., Van Daele, P., Dorgeuille, F., Eitel, S., Klemenc, M., Annen,
R., Van Koetsem, J., Goudeau, J., Bareel, B., Fries, R., Straub, P., Marion, F.,
Routin, J., and Baets, R. (2004b). Chip-to-chip parallel optical interconnects over
optical backpanels based on arrays of multimode waveguides. In Proceedings
of the 9th Annual Symposium of the IEEE Lasers and Electro-Optics Society (LEOS)
Benelux Chapter, pages 61–64. ISBN: 978-90-76546-06-3. Available from: http:
//leosbenelux.org/symp04/s04p061.pdf.
• Bockstaele, R., De Wilde, M., Meeus, W., Rits, O., Lambrecht, H., Van Campen-
hout, J., De Baets, J., Van Daele, P., van den Berg, E., Clemenc, M., Eitel, S., Annen,
R., Van Koetsem, J., Widawski, G., Goudeau, J., Bareel, B., Le Moine, P., Fries, R.,
Straub, P., and Baets, R. (2004a). A parallel optical interconnect link with on-chip
optical access. In Thienpont, H., Choquette, K. D., and Taghizadeh, M. R., editors,
Micro-Optics, VCSELs, and Photonic Interconnects, volume 5453 of Proceedings of
SPIE, pages 124–133. ISBN: 978-0-8194-5376-1. doi:10.1117/12.545490.
• De Wilde, M., Rits, O., Bockstaele, R., Baets, R., and Van Campenhout, J.
(2003a). Design methodology development for VCSEL-based guided-wave
optical interconnects. In Proceedings of the 7th IEEE Workshop on Signal Propa-
gation on Interconnects (SPI 2003), pages 137–138, Siena, Italy. Available from:
http://escher.elis.ugent.be/publ/Edocs/DOC/P103_028.pdf.
• De Wilde, M., Rits, O., Bockstaele, R., Van Campenhout, J. M., and Baets, R. G.
(2003b). A circuit-level simulation approach to analyze system-level behavior of
VCSEL-based optical interconnects. In Thienpont, H. and Danckaert, J., editors,
VCSELs and Optical Interconnects, volume 4942 of Proceedings of SPIE, pages 247–
257. ISBN: 978-0-8194-4737-1. Available from: http://escher.elis.ugent.be/
publ/Edocs/DOC/P103_021.pdf, doi:10.1117/12.471929.
• De Wilde, M., Stroobandt, D., and Van Campenhout, J. (2002). AQUASUN:
adaptive window query processing in CAD applications for physical design
and verification. In Proceedings of the 12th ACM Great Lakes symposium on
VLSI (GLSVLSI 2002), pages 153–159, New York, USA. ISBN: 978-1-58113-
462-9. Available from: http://escher.elis.ugent.be/publ/Edocs/DOC/P102_
007.pdf, doi:10.1145/505306.505339.
• Van Campenhout, J. M., Brunfaut, M., Meeus, W., Dambre, J., and De Wilde,
M. (2001). Sense and nonsense of logic-level optical interconnect: reflections
on an experiment. In Taghizadeh, M., Thienpont, H., and Jabbour, G. E., edi-
tors, Micro- and Nano-optics for Optical Interconnection and Information Processing,
volume 4455 of Proceedings of SPIE, pages 151–159. ISBN: 978-0-8194-4169-
0. Available from: http://escher.elis.ugent.be/publ/Edocs/DOC/P101_145.
pdf, doi:10.1117/12.450436.
Conference papers without international reviewers
• De Wilde, M., Rits, O., Meeus, W., Lambrecht, H., and Van Campenhout, J. (2004).
Integration of modeling tools for parallel optical interconnects in a standard
EDA design environment. In Workshop on Parallel Optical Interconnects Inside
Electronic Systems at the 2004 Design, Automation and Test in Europe (DATE 2004)
Conference, Paris, France. Available from: http://escher.elis.ugent.be/publ/
Edocs/DOC/P104_072.pdf.
Poster presentations / abstracts
• O’Connor, I., Tissafi-Drissi, F., Navarro, D., Mieyeville, F., Gaffiot, F., Dambre, J.,
De Wilde, M., Stroobandt, D., and Van Thourhout, D. (2006b). Optical intercon-
nect for on-chip data communication. In Workshop on Future Interconnect and Net-
works on Chip (NoC) at the 2006 Design, Automation and Test in Europe (DATE 2006)
Conference, volume Poster Abstracts, page 74, Munich, Germany. Available from:
http://async.org.uk/noc2006/pdf/Proceedings_Poster_Abstracts.pdf.
• De Wilde, M. and Rits, O. (2003). Design methodology development for guided-
wave optical interconnects. In European Design and Automation Association (EDAA)
Ph.D. Forum at the 2003 Design, Automation and Test in Europe (DATE 2003) Confer-
ence, Munich, Germany. Available from: http://escher.elis.ugent.be/publ/
Edocs/DOC/P103_022.pdf.
• De Wilde, M., Dambre, J., and Stroobandt, D. (2001). A flexible tree-based
platform for design space exploration of hierarchical FPGA architectures. In
Second Faculty of Engineering (FirW) PhD Symposium. Ghent University. Available
from: http://escher.elis.ugent.be/publ/Edocs/DOC/P101_151.pdf.
Chapter 2
OE-VLSI approach and
prototype realization
In this chapter the components involved in an OE-VLSI setup are discussed.
The first section introduces the problems associated with integrating optics
with CMOS in general, and indicates possible approaches towards a solution.
After this, the different components involved in the OE-VLSI setup of the IO
project are presented, with references to other approaches where relevant. We
start with components for optical signal generation and detection which can
be electrically interfaced. The integration of these components with CMOS
ICs is then addressed, followed by the optically accessible packaging of ICs
and the optical path itself. We complete the link by presenting the CMOS
circuits providing the interface between optoelectric components and general
logic gates.
2.1 Photonics and CMOS
As CMOS is the single dominating IC processing technology, apparently the
most straightforward approach would be the direct integration of optical
emitters and detectors in a CMOS processing flow. However, although silicon
(Si) is an ideal material for transistor realization, it turns out to be far from
straightforward to generate light with silicon. Its indirect band gap is cou-
pled with a far higher probability for electrons and holes to recombine in a
nonradiative rather than a radiative way [Ossicini et al., 2003]. This property
renders silicon poorly suited to LEDs or lasers, which are far more easily realized
using direct band gap III-V compounds such as gallium arsenide (GaAs) and
indium phosphide (InP).
The second-most straightforward alternative would then be the integration of
III-V materials into the CMOS processing flow. The obstacle here is the lattice
constant, which is several percent larger for common III-V compounds than for Si. The
lattices of the crystal structures are thus incompatible, making it very difficult
to grow III-V compounds on silicon wafers.
The third alternative is the heterogeneous integration (or hybridization) of III-V
devices with CMOS, where the incompatible crystal structures are grown
separately and stuck together by some method. Arguably the simplest OE-
VLSI approach—and the one used in the IO project—is the flip-chip bonding
of a piece of III-V wafer containing O-E devices to the CMOS [Goodwin et al.,
1991]. In this approach, tiny solder balls cater for the electrical connectivity
as well as the alignment and initial cohesion (before underfill) between both
wafers.
All three alternatives have been heavily researched since the 1990s. A tighter
integration than flip-chip bonding is indeed desirable, especially if one wishes
to consider intra-chip optical links instead of the inter- and intra-board links at
issue here. Although silicon photonics approaches towards on-chip interconnect
are beyond the scope of this text, we want to draw attention to some very
recent developments.
Silicon-only lasers have been demonstrated by Rong et al. [2005], reducing
the nonradiative recombination by embedding a reverse-biased p-i-n diode
in a silicon waveguide. A recent example of the second approach is the
development of a multiple quantum well (MQW) modulator exploiting the
quantum-confined Stark effect [Miller et al., 1984] in thin germanium (Ge)
structures which can be grown on silicon [Kuo et al., 2005, 2006]. There have
also been attempts to grow gallium arsenide on silicon with conformal epitaxy
[Neefs, 2000b]. Regarding the third approach, using direct heterogeneous
wafer bonding [Tong and Gösele, 1999]—uniting ultra-flat surfaces through
Van Der Waals forces—III-V devices can be directly coupled to silicon-on-
insulator waveguides [Roelkens et al., 2005]. Heterogeneous bonding with
wafer-scale processing methods makes it possible to realize many devices in a small
number of steps. A very recent development even makes lasers with a hybrid
III-V/Si cavity possible [Paniccia et al., 2006].
2.2 Light generation
2.2.1 Generation or modulation?
Approaches for the generation of a modulated light beam fall into two
schools of thought: either the drive of a light source is
adjusted in time, or an invariable external optical beam is modulated.
For chip-to-chip interconnect, the variable-drive method is chiefly embodied
by vertical-cavity surface-emitting lasers (VCSELs) [Soda et al., 1979]. Other
manifestations such as LEDs and resonant-cavity LEDs are popular as well; yet
they cannot attain the near Gaussian-beam directionality of a laser which eases
Figure 2.1: Essential structure of a bottom-emitting VCSEL (schematic cross-section
not to scale). Labeled parts: substrate, n-type DBR, p-type DBR, active region
with quantum wells, oxide aperture, anode, cathodes, emission. The curved lines
over the active region illustrate transverse intensity patterns of different modes.
an efficient waveguide coupling. The common workhorses of the modulation
approach are multiple quantum well (MQW) modulators [Boyd et al., 1987].
Both approaches have their own benefits and disadvantages. For instance,
VCSELs benefit from not requiring the complexity of an external light source,
but the disadvantage is the supply, switching and dissipation of larger powers
in the electronic section. MQW modulators can be embedded in silicon, but
traditionally suffer from a worse contrast than VCSELs. A recent comparison
[Cho et al., 2006] suggests that the VCSEL approach is superior to the MQW
approach at relatively high bandwidths or high detector capacitances, and
vice versa in the lower case; still, taking note of the cutting-edge advances on
both fronts mentioned above, it is unclear when either method prevails.
2.2.2 VCSELs
A VCSEL is a laser realized with semiconductor processing methods on a
wafer of III-V material [Soda et al., 1979; Iga et al., 1984; Koyama et al., 1989].
The typical structure is shown in figure 2.1. Like any laser, the operation relies
on a resonant cavity and a gain medium. The resonant cavity is a spatial region
to which certain optical beam manifestations are confined by two opposing
mirrors parallel to the wafer surface. Each mirror is a distributed Bragg
reflector (DBR)—a succession of material layers with alternating different
refractive indices, through interference efficiently reflecting light around the
targeted emission wavelength. A reduced reflectivity of one mirror lets
part of the confined radiation escape, yielding an outbound optical beam
perpendicular to the wafer surface.
The gain medium is a diode junction in between both DBR mirrors—which are
oppositely doped and electrically contacted—with one or more quantum wells
sandwiched in between. These very thin layers concentrate injected carriers in
energy states that contribute to stimulated emission, a process which amplifies
the confined optical radiation when passing through this active region. Lateral
confinement of carriers and photons is generally enforced by a round aperture
(2–25 µm) in a buried oxide layer next to the quantum wells. When this
diameter is larger than a few microns, more than one optical standing wave
pattern (a mode) can satisfy the boundary conditions of the resonant cavity; the
VCSEL is then called a multimode VCSEL as opposed to a single-mode VCSEL.
Unlike modes of edge-emitting lasers, all modes of a VCSEL exhibit nearly
the same wavelength, as enforced by the DBR mirrors, but show different
transverse intensity patterns. The emission of higher-order modes is more
divergent than the fundamental mode and can even exhibit a minimum in the
center (this effect is called spatial hole burning).
Various factors contribute in positive and negative ways to carrier and photon
concentrations; for an in-depth treatment we refer to Jungo [2003]; Zhang
et al. [2004]; Yu [2003]; Li and Iga [2002]; see also section 3.3.3. The observable
behavior is essentially a linear relation between the outbound beam
power and the injected current in excess of a certain threshold current, which
is generally in the mA range. Above threshold, current changes result in optical
power changes with a second-order dynamic transient response.
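The static part of this behavior can be illustrated with a small numerical sketch. The Python fragment below assumes an idealized piecewise-linear characteristic (no emission below threshold, constant slope above it) and ignores the dynamic response; the function name and the threshold and slope values are hypothetical and are not measured data of the IO project VCSELs.

```python
def vcsel_output_power_mw(current_ma, threshold_ma, slope_mw_per_ma):
    """Idealized static light-current curve: zero below threshold, linear above.

    All parameter values passed in below are illustrative assumptions,
    not measured device data.
    """
    if current_ma <= threshold_ma:
        return 0.0
    return slope_mw_per_ma * (current_ma - threshold_ma)


# Hypothetical device: 1 mA threshold current, 0.5 mW/mA slope.
for current in (0.5, 1.0, 3.0, 5.0):
    power = vcsel_output_power_mw(current, threshold_ma=1.0, slope_mw_per_ma=0.5)
    print(f"{current:.1f} mA -> {power:.2f} mW")
```

In a real device some spontaneous emission is present below threshold and the slope decreases at high currents, so this sketch only captures the first-order picture.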
VCSELs have a number of desirable properties for parallel optical interconnect
applications: they are small, efficient and easily realized in two-dimensional
arrays. Furthermore, they can be tested on-wafer, permitting defect detection
before further processing.
2.2.3 Bottom-side emission
The creation of structures on and in a semiconductor wafer generally takes
place at only one side (the ‘top’ side); all but a thin layer remains bulk material
(the ‘substrate’). It is customary to refer to the top or bottom side of a wafer
or die using this frame of reference, rather than referring to the physical
orientation.
Top-emitting 850-nm VCSELs are the standard for data communication, and
are used in one-dimensional parallel optical links. The flip-chip bonding
of VCSELs with CMOS however calls for a special approach. As the metal
contacts are created at the top side, after flip-chip bonding this side will be
facing the CMOS. The VCSELs thus need to be bottom-emitting, involving
beam propagation through the substrate. This is a major problem for 850-nm
VCSELs, as emission at this wavelength is absorbed in the conventional GaAs
substrate within a few micrometers. Clearly, other approaches must be used,
such as the complete removal of the substrate, or the use of electrical vias. We
mention that top-emitting approaches exist as well, with emission through
the CMOS substrate instead of the III-V substrate, where an ultra-thin layer of
silicon is grown on a transparent sapphire substrate to avoid absorption in
silicon [Liu et al., 2003].
Figure 2.2: Photograph of an 8 × 8 VCSEL array die from Avalon Photonics (used in
the IO project) and a SEM close-up of one VCSEL. Labeled features: 250 µm pitch,
anode, cathode.
2.2.4 Choice of wavelength
In the IO project, a wavelength different from 850 nm is used. The VCSELs are
based on indium gallium arsenide (InGaAs) quantum wells which stimulate
emission in the 960–980 nm range. There is a trade-off involving the precise
wavelength choice. As will be shown in the next section, the detector side
employs a hybridization approach similar to that of the VCSELs, resulting
in substrate-illuminated photodiodes. The emission wavelength should be
sufficiently long to avoid absorption of the generated light in the indium
phosphide (InP) substrate of the photodiodes, which sets in as the wavelength
approaches 950 nm [Beaudoin et al., 1997], especially at elevated temperatures. The
emission of wavelengths longer than 980 nm however requires a high indium
content in the active region of the VCSEL, increasing the mechanical stress due
to crystal lattice mismatch. This mechanical stress might limit the lifetime of
the devices. Therefore, a compromise of 970 nm was made as to the emission
wavelength.
Figure 2.3: Photographs of infrared camera monitors showing light being emitted from
the VCSEL array.
VCSELs have been realized in 8 × 8 and 16 × 16 arrays on a 250–µm pitch. A
large maximal output power level of 5 mW was established to enable testing
of optical pathways with up to 10 dB loss. Oxide apertures of 8, 10 and 12 µm
were realized to evaluate the trade-off between the threshold current and the
efficiency. Ultimately, the VCSELs with 12–µm oxide apertures have been used
for the demonstrators as this allowed for the fastest modulation (at the cost of
an increased threshold current). Figure 2.2 shows a photograph of an 8 × 8
VCSEL array die and a close-up of one VCSEL.
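To make the power budget concrete, the following Python sketch converts the quoted 5 mW maximal output power and 10 dB path loss into the optical power that would remain at the end of such a path. The helper names are hypothetical; only the two numbers are taken from the text, and the calculation is the standard decibel conversion.

```python
import math


def mw_to_dbm(power_mw):
    """Convert an optical power in mW to dBm."""
    return 10.0 * math.log10(power_mw)


def power_after_loss_mw(power_mw, loss_db):
    """Power remaining after a path with the given loss in dB."""
    return power_mw * 10.0 ** (-loss_db / 10.0)


launched_mw = 5.0   # maximal VCSEL output power quoted in the text
path_loss_db = 10.0  # largest optical path loss to be tested

received_mw = power_after_loss_mw(launched_mw, path_loss_db)
print(f"launched: {launched_mw} mW = {mw_to_dbm(launched_mw):.2f} dBm")
print(f"received: {received_mw} mW = {mw_to_dbm(received_mw):.2f} dBm")
```

A 10 dB loss thus leaves one tenth of the launched power, i.e. 0.5 mW for a VCSEL driven at its maximal output.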
2.3 Light detection
2.3.1 External or integrated?
Optical detectors can either be realized on a separate III-V wafer, as our
VCSELs are, or directly integrated in CMOS technology; such integration is much
more easily achieved for detectors than for VCSELs. In the literature, a lot of
work can be found on the development of high-speed detectors designed and
fabricated directly in CMOS technology; for an overview we refer to Zimmermann [2004].
Integrated photodetectors avoid the cost of flip-chip bonding and external de-
tector manufacturing. External detectors have the advantage of not occupying
CMOS chip area, thus avoiding a trade-off between detector size and receiver
circuit size which restricts the attainable signaling rate. Furthermore, the
960–980 nm emission range at hand is at the high side of the silicon absorption
spectrum, yielding a reduced responsivity for CMOS photodetectors [Simpson
et al., 1999].
2.3.2 p-i-n photodiodes
In the IO project, external photodetector arrays were used. The associated
extra cost is not too high, as the post-processing of the CMOS before flip-chip
bonding is a wafer-scale process required anyway for the VCSEL arrays. A
practical advantage that also accompanies this choice is the full independence
of the CMOS technology as to the realization of O-E conversion devices.
The photodetectors of the IO project are positive-intrinsic-negative (p-i-n)
photodiodes [Watanabe and Nishizawa, 1950]. The typical structure is shown
in figure 2.4. Between the p-type and n-type regions of a diode, an intrinsic
region is realized. The motivation for this is that a very large
depleted region forms when a reverse bias voltage is applied. An incident photon
with a suitable energy can excite a mobile electron, creating an electron-hole
pair. When this occurs in the depleted region, the electron and the hole
quickly separate under the influence of the electrical field there. Incident light
of a suitable wavelength thus gives rise to a proportional electrical current
in addition to the very small saturation current of the diode (in this context
referred to as the ‘dark current’). The charge depletion gives rise to a sizeable
photodiode capacitance which has to be taken into account in receiver circuit
designs. The incremental capacitance decreases with higher bias voltages.
Figure 2.4: General structure of a p-i-n photodiode (schematic cross-section not to
scale). Labeled elements: substrate (intrinsic InP), p+-type InGaAs, intrinsic InGaAs,
n+-type InP, anode, cathodes, illumination.
Figure 2.5: Photograph of an 8 × 8 photodiode array die from Albis Optoelectronics
(used in the IO project) and close-up of one photodiode. Labeled features: 250 µm
pitch, 80 µm detector diameter, anode, cathode.
The active region of the diodes consists of an InGaAs absorbing layer, grown on
an InP substrate. The layer structure is optimized for high speed and maximal
absorption. An important parameter is the diameter of the detector. This
should be sufficiently large to capture all light from the fiber, as light missing
the detector reduces the received signal amplitude and could eventually
be absorbed in an adjoining device, inducing crosstalk there. The optimal
detector diameter is a trade-off between the fiber core diameter, numerical
aperture and alignment tolerance on the one hand, and the photodiode
capacitance, which increases with device size, on the other. Detectors with
an 80–µm diameter have been used, which should tolerate a wide choice of
fibers. The dark current at room temperature and 2 V reverse bias voltage is
6–7 nA; the diode capacitance is 505 fF on average. Uniform 8 × 8 and 16 × 16
arrays have been realized on a 250–µm pitch. Figure 2.5 shows a photograph
of an 8 × 8 photodiode array die and a close-up of one device.
2.4 Flip-chip bonding
The heterogeneous integration of the VCSEL and photodiode arrays with
CMOS is performed using flip-chip bonding. In the IO project, an indium
(In) bump technology has been applied, an approach that was first
demonstrated for O-E arrays on CMOS by Jahns et al. [1992]. The ductility of
indium decreases the mechanical stresses at ball joints, thereby increasing the
yield and lifetime.
The steps involved are shown in figure 2.6. The CMOS wafers are first
post-processed: solder is deposited on the wafer and reflowed. Due to the
self-aligning effect during reflow, the accuracy on the position of the bumps
is better than ±0.5 µm. After the reflow, a polymer underfill is applied so that
stresses are distributed over the surface rather than acting only on the joints. This
underfill also protects the indium bumps when exposed to subsequent
high-temperature steps in further processing (such as the mounting of the BGA
package on the PCB). The O-E arrays are then thinned from 150 µm to 30 µm
to avoid optical absorption in the substrate and to allow fibers to approach the
O-E devices more closely. Before dicing, an anti-reflective coating is applied
to suppress Fresnel reflection losses. Figure 2.7 displays part of a hybridized
CMOS wafer with O-E arrays before dicing.
Figure 2.6: Schematic overview of steps involved in the flip-chip bonding process. (a)
CMOS bumping; (b) installation of VCSEL and photodiode arrays; (c) underfill; (d)
thinning; (e) application of anti-reflective coating; (f) dicing.
Figure 2.7: VCSEL and detector arrays flip-chipped on a digital CMOS wafer containing
DTA ICs. The bumps above and below the O-E arrays (the light grey areas) are for
carrying a silicon block for MT pin alignment (see section 2.5). The outer bump ring is
used for a possible chip-scale packaging approach. Labels in the photograph mark
alternating 8×8 RX and 8×8 TX arrays.
2.5 Packaging and optical path
The packaging of OE-VLSI modules differs from the packaging of purely
electronic chips. Besides the common electrical, mechanical (IC to PCB) and
thermal (heat sink) interfaces, the package needs to provide an optical interface
from the chip to external optical waveguides as well. The adopted packaging
approach and the realization of optical waveguides have to be considered as a
whole, as the optical features in the package are only there to smoothly mate
with external waveguides.
2.5.1 Optical path: build-up or PCB-integrated?
There is a choice related to the combination of the optical path and the PCB that
carries the OE-VLSI chip: either the waveguides are realized inside the PCB,
or they exist on the outside using some build-up technology. One expects the
integrated approach to yield the most attractive end result. However, a large
amount of technological complexity is involved concerning the manufacturing
and embedding of the waveguides, routing aspects (how to cross over?), and
particularly, the coupling at waveguide endpoints with large two-dimensional
optical arrays of OE-VLSI modules, or optical connectors at the PCB edge. At
the beginning of the IO project, this complexity was still largely untreated
and even now, five years later, there are no embedded solutions addressing all
aspects at once.
Nevertheless, in the meantime there have been a number of successful partial
solutions as well as full systems avoiding crossovers, very large O-E arrays and
PCB edge connectors. Figure 2.8 illustrates the general idea of PCB-embedded
optical interconnect; exact positions of lenses and waveguides may still vary.
For a presentation of a recent full system realization and pointers to other
approaches, we refer to Schares et al. [2006].
2.5.2 MT/MPO-like approach
Although the IO project also targeted a chip-scale packaging approach [Marion
et al., 2004] for opto–PCB integration [Van Steenberge et al., 2006; Straub,
2004], for the main demonstrator a fiber-based build-up technology was
envisaged. The advantage is one of a practical kind: standard PCB technology
is applicable, and even standard pin grid array (PGA) or ball grid array (BGA)
packages are usable, except for the lid of the package which would provide
the optical access.
Figure 2.8: General idea of the predominant approach adopted in present research on
OE-VLSI systems with PCB-embedded waveguides (cross-section). Labeled elements:
CMOS, VCSEL array, photodiode array, special package, possible lens, 45° mirrors,
topside or embedded waveguides, PCB.
Figure 2.9: View into the mating face of a 6 × 12 MPO connector (provided by Tyco).
Figure 2.10: Relevant dimensional features of a MT ferrule (view on the mating face).
The interface positions in green represent the 8 × 8 positions used in the IO project;
the positions in red refer to a standardized 6 × 12 MPO connector. Labeled dimensions:
0.7 mm guide hole/pin, 4.6 mm hole/pin spacing, 0.25 mm pitch between the
different channel positions.
The interface of the package with the optical path has been inspired by the
mechanically transferable (MT) connector standard [IEC 61754-5] and is close
to the multi-fiber push-on (MPO) connector standard [IEC 61754-7]. Figure
2.9 displays a 6 × 12 MPO connector. The chief MT/MPO interface features
are the following (see figure 2.10). Accurate mechanical alignment between
the two parties involved is provided by two parallel guide pins (0.7 mm diameter)
at a spacing of 4.6 mm attached to the mating part of one party, fitting into
corresponding holes at the mating part of the other party. Centered between
these features, different optical channels terminate in a planar mating surface,
organized in a matrix at a pitch of 250 µm. To avoid row/column confusion,
the established terminology defines a row as being parallel to the imaginary
line connecting the alignment pins/holes at the mating surface. Standard
MT defines a single row of 4, 8 or 12 channels; MPO extends this to two and six
12-channel rows. The IO project choice for eight 8-channel rows is historically
driven by the OIIC project and the fact that multirow MPO was not yet well
established at the moment of decision. The exact choice between similar array
sizes is however more of a practical than a fundamental kind, though having
more columns than rows is arguably better to minimize the effect of guide pin
misalignment.
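The geometry described above can be tabulated with a short Python sketch. It places the channel centers on a 250 µm grid centered between the guide pins and compares the footprint of the 8 × 8 IO layout with that of a 6 × 12 MPO layout. The function names and the coordinate convention (origin midway between the pins, x along the line connecting them) are assumptions made for this illustration; the pitch, pin diameter and pin spacing are the values quoted in the text.

```python
def channel_positions_um(rows, cols, pitch_um=250.0):
    """Channel centres (x, y) in µm, origin midway between the guide pins.

    x runs along the line connecting the guide pins, so a row is a set of
    channels sharing the same y value (assumed convention for this sketch).
    """
    x0 = -(cols - 1) * pitch_um / 2.0
    y0 = -(rows - 1) * pitch_um / 2.0
    return [[(x0 + c * pitch_um, y0 + r * pitch_um) for c in range(cols)]
            for r in range(rows)]


def array_extent_um(rows, cols, pitch_um=250.0):
    """Centre-to-centre width (along the rows) and height of the channel matrix."""
    return ((cols - 1) * pitch_um, (rows - 1) * pitch_um)


PIN_SPACING_UM = 4600.0   # guide pin centre-to-centre spacing
PIN_DIAMETER_UM = 700.0   # guide pin diameter

for rows, cols in ((8, 8), (6, 12)):
    width, height = array_extent_um(rows, cols)
    # Distance from the outermost channel centre to the nearest pin surface.
    clearance = (PIN_SPACING_UM - PIN_DIAMETER_UM - width) / 2.0
    print(f"{rows} x {cols}: {width:.0f} um wide, {height:.0f} um high, "
          f"{clearance:.0f} um from outer channel to pin")
```

Both layouts fit between the guide pins; the 8 × 8 matrix is square, whereas the 6 × 12 matrix is wider along the pin axis and lower perpendicular to it.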
The similarity of this approach to MT/MPO standards and the present devel-
opmental stage of commercial multi-row transceivers add to the applicability
of our coupling efficiency characterization between VCSEL arrays and multi-
fiber connectors, which is discussed in section 4.3.
2.5.3 Package
2.5.3.1 General scheme
The IO packaging approach is applicable to general PGA and BGA packages.
Instead of a general featureless lid sealing the package, a special lid is used
with MT guide pins and a hole which comes above the O-E arrays after
application. Figure 2.11 shows the assembled package for the DTA IC, a
ceramic BGA with 302 balls.
Aiming for simplicity, the coupling of fibers to the O-E arrays is performed
through direct fiber butt coupling. Any connector with standard MT dimen-
sions can mate with the package. Still, for an efficient butt coupling the front
connector surface needs to be closer to the O-E arrays than is possible with
a common flat-front connector, given the thickness of a robust lid and its
minimal distance to the CMOS surface (to cater for bond wires). For this
reason an extra protrusion of 1.5 mm of the connector through the hole in the
lid was adopted, yielding a working distance targeted at 50 ± 40 µm between
the mating surface of the connector and the accessible surface of the O-E
arrays. Figure 2.12 shows different connectors that have been used.
The dice in the experimental packages have remained bare. Encapsulation of
the dice will be required for long-term dust and moisture protection.
Techniques involving a transparent glass lid, polymeric encapsulants such as
silicone and parylene, or a microlens array, face plate or fiber array pigtail
glued into the lid could be applied to future packages, admittedly reducing
the simplicity of the approach if this involves an extra alignment step.
Alternatively, the connector could be glued into the lid for a semi-hermetic end
product (yet less flexible).
Figure 2.11: Photograph of an assembled package with integrated optical access.
Two sets of MT guide pins (one set for the VCSEL array and one for the photodiode
array) are clearly visible (photo by CEA-LETI).
The position of the lid relative to the hybridized CMOS has to be very accurate,
as this determines the quality of the optical coupling to (and from) the fibers.
For ease of explanation, we designate position changes parallel to the
CMOS surface as XY-movements, and denote the direction normal to it by the letter Z.
The position of the alignment pins determines the XY alignment; the vertical
position of the lid establishes the Z alignment.
Two separate packaging approaches have been developed: one using silicon
blocks for precise XY alignment, and one XYZ technique using index alignment
during assembly [Rits et al., 2004, 2006].
2.5.3.2 Silicon bench approach
The silicon bench approach is shown in figure 2.13. Two silicon blocks with
precision holes for the alignment pins are flip-chipped onto the CMOS, next
to the O-E arrays (the solder balls for this operation are also visible in figure
2.7). The precision of the XY alignment is determined by the precision of the
block and of the flip-chip process. This can be very accurate, typically around
1 µm. The alignment pins are glued into the silicon benches. After application
of a lid with deliberately oversized holes, the pins are additionally glued into
these holes in order to absorb forces in the lid instead of the benches.
Figure 2.12: (a) an 8 × 8 molded ceramic ferrule with POF ribbons (design by FCI).
(b) a commercially available 6 × 12-ribbon multimode glass fiber cable with multi-fiber
push-on (MPO) connectors that have been milled at the sides to create the required
protrusion (kindly provided by Tyco). (c) a connector prototype with alignment
features realized by deep proton writing (kindly provided by Michael Vervaeke of
the TONA department of the Vrije Universiteit Brussel; see Debaes et al. [2006] for
information on this technique).
Figure 2.13: Packaging approach using silicon benches (photos by CEA-LETI)
Figure 2.14: Index alignment packaging approach (photos courtesy of Rits et al. [2004]).
Figure 2.15: 12-wide graded-index glass fiber ribbon with a fiber-to-fiber pitch of
250 µm (cable provided by Tyco).
The advantage of this approach is the high XY accuracy of the silicon benches.
No special considerations have to be made in the CMOS design except for
a big enough chip to hold the benches; the solder balls are placed on the
topmost passivation layer.
There are however some disadvantages. The Z alignment is to be provided by
other means. Although the base of the guiding pins is accurately positioned,
if the pins are even slightly off plumb, the XY alignment at the mating face of
the lid is impaired. When a connector is then pushed onto the lid—forcing
the mating face of the connector holes to align with the lid—this will force
the guiding pins upright. The glue in the lid holes will maintain the pin
positions there but cannot absorb the resulting torque, which is transferred to
an equivalent XY force at the base of the pins. The brittle silicon benches are
not resilient to such a force and will easily break.
2.5.3.3 Three-dimensional index alignment approach
The index alignment approach is illustrated in figure 2.14. A coordinate
measurement machine is used to discover the three-dimensional position of
marks on a lid with integrated pins relative to marks on the hybridized CMOS.
Using actuators, the lid is brought to the optimal location in a controlled way.
This optimal alignment is finally made permanent by a UV-curable adhesive.
For more information on this approach we refer to Rits et al. [2004].
The main advantage of this approach is that it is contactless with respect to the
CMOS, offering full 3-D alignment of two independent objects. The positioning
accuracy is worse than that of the silicon bench approach. It is estimated at ±5 µm,
which will turn out to be sufficient for this application (we refer to section
4.3.5 on the array-wide VCSEL-fiber coupling efficiency).
2.5.4 Optical path
The optical path is provided by graded-index fibers which exhibit excellent
properties; these are discussed in section 4.3.1. A cable for a two-dimensional
array of optical fibers consists of multiple fiber ribbons (figure 2.15) which
are stacked and terminated in a connector. Figure 2.16 shows the insertion of
such a cable into a DTA module.
2.6 O-E interfacing circuits
CMOS logic gates cannot directly modulate VCSELs and do not accept
photodiode currents; to this end interfacing circuits are required. For each outgoing
signal, a driver circuit converts a CMOS logic input into a corresponding
current suitable for direct VCSEL modulation. At the receiver side, each
photodiode’s photocurrent is converted back into a CMOS logic output.
Figure 2.16: Manual insertion of a fiber ribbon stack cable
Figure 2.17: Driver circuit on the DTA IC (in 0.35–µm CMOS; design courtesy of Helix
AG semiconductors). (a) Essential schematic; (b) 250 µm × 250 µm CMOS layout.
Labeled elements: supply, data, T1, T2, T3, I1, Imod or 0, anode, cathode.
In a true OE-VLSI design, driver and receiver cells are directly integrated in
CMOS and located directly underneath the O-E arrays. This restricts the size
of the circuits in two dimensions to the pitch of the O-E devices. Hence, the
room for extra amplifier stages and decoupling provisions is limited.
Figure 2.18: Driver circuit used for the 0.18–µm CMOS design (design courtesy of
Helix AG semiconductors). (a) Essential schematic; (b) current peaking principle
(VCSEL current versus time for the bit sequence 1 0 1). The NMOS transistors are
driven in much the same way as in figure 2.17(a); T3, T5 and T7 are current sources;
the switches T2, T4 and T6 are driven with a temporal dependency on the digital input
as indicated on subfigure (b). Labeled elements: supply, I1, T1 (3.3 V PMOS),
T2 to T7 (1.8 V NMOS).
2.6.1 Driver circuit
The basic functionality of a VCSEL driving circuit is switching the laser current
between two levels I0 and I1 as dictated by a digital input. The on-current I1
determines the maximal optical output power. The off-current I0 should be
low for a good contrast, but still well above the VCSEL threshold current Ith
to avoid turn-on delay and a slow dynamic laser response.
In the DTA IC, a simple but adequate topology was used (figure 2.17(a)). The
high-side p-FET T1 drives the constant current I1 into the VCSEL and a shunt
path, which is switchable through n-FET T2, diverting the modulation current
I1 − I0 = Imod away from the VCSEL using constant-current n-FET T3. When
the shunt path is active, the resulting VCSEL current is I1 − (I1 − I0) = I0,
otherwise it remains I1.
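The current bookkeeping of this topology can be written down in a few lines. The Python sketch below is only an illustration of the relation I0 = I1 − Imod; the function name and the current values are hypothetical, and which logic level activates the shunt path is deliberately left open since the text does not specify it.

```python
def vcsel_current_ma(shunt_active, i_on_ma, i_mod_ma):
    """VCSEL current for a driver that shunts Imod away from a constant I1.

    shunt_active: True when the switch (T2 in the text) closes the shunt path.
    i_on_ma:      constant current I1 supplied by the high-side transistor.
    i_mod_ma:     modulation current Imod drawn by the shunt current source.
    """
    if not 0.0 <= i_mod_ma <= i_on_ma:
        raise ValueError("modulation current must lie between 0 and the on-current")
    return i_on_ma - i_mod_ma if shunt_active else i_on_ma


# Illustrative values only: I1 = 5 mA and Imod = 4 mA give I0 = 1 mA.
I_ON_MA = 5.0
I_MOD_MA = 4.0
for shunt_active in (False, True):
    current = vcsel_current_ma(shunt_active, I_ON_MA, I_MOD_MA)
    print(f"shunt active: {shunt_active} -> VCSEL current {current} mA")
```

Because T1 always carries I1 regardless of the data, the supply current stays constant in this sketch; only its division between the VCSEL and the shunt path changes.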
Using 0.35–µm CMOS, this approach is fast enough for 1.25 Gbps signaling. In
the DTA IC, T1 is able to drive a maximal on-current of 12 mA and T3 permits
a maximal modulation current of 8 mA. Static power consumption is about
3.3 V times the on-current; the additional dynamic dissipation is 2.7 mW (at a
1.25 Gbps pseudorandom sequence; 1 mA/5 mA drive). In CMOS technology
with smaller features, faster signaling is possible yet several problems arise
[Annen and Melchior, 2002; Razavi, 2003]:
• VCSEL current and voltage levels do not necessarily scale down at the
same rate as general logic cells. The fast low-voltage CMOS transistors
in 0.13–µm and smaller technologies cannot cope with source-drain
voltages larger than 1.5 V, which is about the magnitude of the operational
VCSEL voltage. A solution here is to feed the VCSEL anode from an elevated
supply, and to use only low-side drive circuits. The solution applied for
the IO project’s 0.18–µm CMOS module is to use a slow 3.3–V transistor
for T1 (this transistor is essentially static anyway) and fast 1.8–V
transistors for T2 (T4, T6) and T3 (T5, T7) (see figure 2.18(a)).
• The intrinsic VCSEL bandwidth does not necessarily scale with the
CMOS switching rate either. Beyond a certain frequency (about 4–5 GHz
for the IO project VCSELs) higher frequencies are attenuated.
This can be mitigated by an equalization approach (see also section 1.4.2.3
which mentions equalization in the context of electrical interconnect).
Figure 2.18 shows the approach adopted in the 0.18–µm CMOS design:
the low-pass effect of the VCSELs is compensated by a short initial
peaking of the current [Annen and Melchior, 2002].
• In a single-ended setup, power and ground noise easily couple into
the transmitted signal, especially at reduced supply voltages.
Furthermore, large MOSFETs switching between saturated and non-saturated
regions of operation take considerable time to do so, which can limit
the attainable bandwidth increase. A solution is to apply a differential
driver topology, which is robust to power-ground noise and maintains
saturation in all constant-current MOSFETs. We refer to Sialm et al.
[2006] for more information on differential driver design.
Figure 2.19: Simplified view of the receiver circuit. The transimpedance amplifier is
shown in more detail in figure 2.20. Labeled blocks: supply, transimpedance
amplifier, limiting amplifier, postamplifier, decision level control, output,
AC-JTAG cell (boundary scan and on-wafer circuit testing).
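The effect of current peaking mentioned in the second item can be illustrated with a deliberately simplified model. The Python sketch below treats the VCSEL as a first-order low-pass filter (the actual response is of second order, as noted in section 2.2.2) and compares a plain current step with a step that starts with a short overshoot. The time constant, peaking factor and function names are assumptions chosen for illustration, not parameters of the IO project driver.

```python
import math


def simulate_low_pass(drive, tau, dt):
    """Response of a first-order low-pass to a piecewise-constant drive.

    Uses the exact exponential update per time step, starting from zero.
    """
    alpha = math.exp(-dt / tau)
    y, out = 0.0, []
    for x in drive:
        y = x + (y - x) * alpha
        out.append(y)
    return out


def time_to_fraction(response, dt, fraction=0.9):
    """Time at which the response first reaches the given fraction of 1."""
    for n, y in enumerate(response):
        if y >= fraction:
            return (n + 1) * dt
    return None


tau = 40e-12        # assumed time constant (corner frequency near 4 GHz)
dt = 1e-12          # simulation time step
n_steps = 400
peak_factor = 1.5   # assumed initial overshoot of the drive current

# For a first-order system, a peak of this duration brings the output
# to its final value right when the drive returns to the nominal level.
peak_steps = round(tau * math.log(peak_factor / (peak_factor - 1.0)) / dt)

plain = [1.0] * n_steps
peaked = [peak_factor] * peak_steps + [1.0] * (n_steps - peak_steps)

plain_response = simulate_low_pass(plain, tau, dt)
peaked_response = simulate_low_pass(peaked, tau, dt)

t_plain = time_to_fraction(plain_response, dt)
t_peaked = time_to_fraction(peaked_response, dt)
print(f"time to 90% without peaking: {t_plain * 1e12:.0f} ps")
print(f"time to 90% with peaking:    {t_peaked * 1e12:.0f} ps")
```

In this model the peaked drive reaches 90% of the final level considerably sooner than the plain step while the output hardly overshoots, which is the intent of the peaking scheme; a real design would have to account for the second-order laser dynamics.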
2.6.2 Receiver circuit
The purpose of a receiver circuit is the conversion of the photodiode current
signal into a rail-to-rail voltage signal suited to drive CMOS gates. For a recent
in-depth review of CMOS receiver circuits we refer to Emami-Neyestanak
[2004]. The current-to-voltage conversion is performed either by a transimpe-
dance amplifier (TIA) or an integrating front-end circuit. In the TIA approach,
the photocurrent signal is passed through a resistor, yielding a voltage signal
which is amplified further on. In an integrating front-end, the photocurrent
is integrated over a capacitor, the voltage across which is sampled at regular
intervals; successive samples are compared to reconstruct the digital signal.
In the IO project hardware, the more common TIA approach has been applied;
it allows the separation between the reconstruction of an optical signal to
CMOS voltage levels and the temporal interpretation of this signal as a
sequence of bits. A benefit is testability: the uninterpreted signal can be
sampled off-chip if desired, or diagnosed with an oscilloscope. It additionally
allows on-chip per-channel retiming circuits to be placed at a distance, outside
the array. Figure 2.19 shows the building blocks of the circuit used in the
0.35–µm CMOS DTA IC design. We will now briefly discuss the composition
of this setup.
Small current, high capacitance Photocurrents range from a few µA to a
few tens of µA and thus are small. As mentioned above, a photodiode
exhibits a sizeable capacitance (in the IO project about 0.5 pF). If the voltage
across a photodiode changes, ensuing parasitic currents will be added to the
photocurrent signal. For instance, a voltage change of only 10 mV over 1 ns
will contribute an average parasitic current of about 5 µA. The
high transimpedance needed to convert the photocurrent to a sizeable
voltage and the low input impedance of the receiver required for a steady
voltage across the photodiode are conflicting requirements.
Figure 2.20: Transimpedance amplifier of the DTA IC in 0.35–µm CMOS with two
transconductance(gm)-transresistance stages (design courtesy of Helix AG
semiconductors). Labeled elements: current input, voltage output, photocurrent sink,
gm stages, feedback.
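The parasitic current quoted above follows from the capacitor relation i = C · ∆V/∆t. The short Python check below reproduces the number from the values in the text; the function name is a hypothetical helper introduced only for this illustration.

```python
def parasitic_current_a(capacitance_f, delta_v, delta_t):
    """Average current through a capacitance for a voltage change delta_v in delta_t."""
    return capacitance_f * delta_v / delta_t


# Values from the text: 0.5 pF photodiode, 10 mV change over 1 ns.
i_par = parasitic_current_a(0.5e-12, 10e-3, 1e-9)
print(f"parasitic current: {i_par * 1e6:.1f} uA")
```

This parasitic contribution is of the same order as the smallest photocurrents mentioned, which is why the input voltage has to be kept steady.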
A transimpedance amplifier (TIA) is a circuit realizing both the low input
impedance and the high transimpedance requirement. Figure 2.20 shows the
TIA architecture used in the DTA IC. Two transconductance-transresistance
stages constitute a voltage amplifier. The sizeable voltage swing at the output
controls a transistor which assimilates the photocurrent; the large voltage
amplification keeps the input voltage much more steady than the output
voltage.
Limiting amplifier and decision control Although the output of the TIA
is a voltage signal, its swing is still rather limited if a high bandwidth is
envisaged. Therefore a second amplifier is put into action which amplifies
its voltage input to a signal adhering to standard CMOS logic levels. An
additional postamplifier protects the limiting amplifier from having to drive
high capacitive loads.
In the DTA IC, the limiting amplifier is simply a CMOS inverter. A low-pass
filter on the output of this amplifier (with a time constant in the µs range)
provides decision level control (see figure 2.19). It controls a photodiode-
shunting current source which strives to bias this inverter around its midpoint.
Hence the decision control loop adapts to an unknown average photocurrent.
This however necessitates dc-balanced signaling to avoid the decision point
running away during otherwise long runs of 0s or 1s. In section 4.5 we
examine whether it is feasible to open the control loop and feed it from
another channel, thus relieving us from the dc balancing requirement.
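The need for dc-balanced signaling can be made plausible with a simplified model of the averaging filter alone. The Python sketch below tracks the low-pass filtered value of a normalized bit stream (levels 0 and 1) and compares a balanced pattern with one that ends in a long run of 1s. It models only a first-order filter, not the complete control loop; the 1 µs time constant is an assumed value within the "µs range" stated in the text, and the function name is hypothetical.

```python
import math


def track_average(bits, bit_time, tau, start=0.5):
    """First-order low-pass filtered value of a normalized bit stream."""
    alpha = math.exp(-bit_time / tau)
    level, trace = start, []
    for b in bits:
        level = b + (level - b) * alpha
        trace.append(level)
    return trace


bit_time = 1.0 / 1.25e9   # bit period at 1.25 Gbps
tau = 1e-6                # assumed filter time constant

balanced = [0, 1] * 2000                  # perfectly dc-balanced pattern
long_run = [0, 1] * 1000 + [1] * 2000     # ends with 2000 consecutive 1s

drift_balanced = max(abs(v - 0.5) for v in track_average(balanced, bit_time, tau))
drift_long_run = max(abs(v - 0.5) for v in track_average(long_run, bit_time, tau))
print(f"largest deviation from midpoint, balanced pattern: {drift_balanced:.4f}")
print(f"largest deviation from midpoint, long run of 1s:   {drift_long_run:.4f}")
```

With the balanced pattern the filtered value stays very close to the midpoint, whereas the long run of 1s drags it far toward the 1 level, which is the runaway of the decision point that dc balancing prevents.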
Test provisions Besides the components required for correct operation of the
receiver circuit, there are test provisions as well, which allow boundary scan
testing and the functional verification of the operation of the receiver circuit
in itself. We refer to section 3.4 for additional information on this subject.
Bandwidth and sensitivity Although the input voltage is more steady than
the output voltage, it is not entirely constant: if a photocurrent swing ∆Iin
yields an output swing ∆Vout = R · ∆Iin (where R is the transimpedance),
the input voltage will exhibit a voltage swing of ∆Vout/Gamp (where Gamp is
the gain of the gm/gm amplifier). The effective input impedance R/Gamp,
combined with the aggregate capacitance C (photodiode depletion and input
MOSFET), brings on a first-order response with time constant τ = RC/Gamp—
the dominating time constant which limits the bandwidth of the system.
The transimpedance R fixes the minimal sensitivity of the receiver, as the limit-
ing amplifier needs a ∆Vout ≥ 50mV in order to limit intersymbol interference
to a reasonable amount (as predicted by worst-case simulations for 0.35–µm
CMOS at 1.25Gbps).
The bandwidth Bamp of the gm/gm voltage amplifier has to be high enough:
the associated time constant τamp = 1/(2πBamp) should not exceed τ/2 to
ensure the stability of the feedback loop of the TIA. This constraint fixes the
minimal tail current of the amplifier stages and hence the power consumption
of the TIA.
The design sensitivity has been established at 40 µA peak-to-peak photocurrent,
which is much higher than the thermal noise level. This choice was made
to minimize the overall link dissipation while allowing up to 7dB loss in
the optical path. The eventual total receiver power consumption amounts
to 10mW. The sensitivity choice additionally assures high resilience against
noise from other channels and digital logic.
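The relations in this paragraph can be tried out numerically. The following is only a minimal sketch: the 50 mV swing and the 40 µA sensitivity are taken from the text, whereas the capacitance and amplifier gain are hypothetical values picked for illustration (the text does not state them, nor the actual transimpedance of the DTA IC).

```python
import math

# Values quoted in the text
DELTA_V_OUT_MIN = 50e-3   # V, minimum TIA output swing needed by the limiting amplifier
DELTA_I_IN = 40e-6        # A, design sensitivity (peak-to-peak photocurrent)

# Hypothetical values, chosen only for illustration
C_IN = 500e-15            # F, assumed photodiode + input MOSFET capacitance
G_AMP = 5.0               # assumed gain of the gm/gm voltage amplifier


def min_transimpedance(delta_v_out, delta_i_in):
    """Smallest R for which delta_v_out = R * delta_i_in is reached."""
    return delta_v_out / delta_i_in


def input_time_constant(r, c, g_amp):
    """Dominant time constant tau = R*C/G_amp of the TIA input node."""
    return r * c / g_amp


def min_amplifier_bandwidth(tau):
    """Smallest B_amp satisfying tau_amp = 1/(2*pi*B_amp) <= tau/2."""
    return 1.0 / (math.pi * tau)


r_min = min_transimpedance(DELTA_V_OUT_MIN, DELTA_I_IN)
tau = input_time_constant(r_min, C_IN, G_AMP)
print(f"R >= {r_min:.0f} ohm")
print(f"tau = {tau * 1e12:.0f} ps, "
      f"first-order bandwidth = {1 / (2 * math.pi * tau) / 1e9:.2f} GHz")
print(f"B_amp >= {min_amplifier_bandwidth(tau) / 1e9:.2f} GHz")
```

The stability constraint τamp ≤ τ/2 translates into an amplifier bandwidth of at least twice the first-order bandwidth 1/(2πτ) of the input node.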
0.18–µm CMOS receiver For the receiver design in 0.18–µm CMOS technol-
ogy, the gm/gm amplifier approach used in the DTA IC is suboptimal as
the transistor properties make it difficult to realize a robust gain with this
topology. A differential amplifier circuit was adopted (figure 2.22), with a
cascode configuration to maximize the attainable gain-bandwidth product.
The attainable bit rate is 2.5Gbps, with a power consumption of 90mW
and a sensitivity of 10 µA peak-to-peak photocurrent.
The differential approach provides resilience against supply noise and ground
bounce as well. The reason for this resilience is the following. As shown in
figure 2.21, the positive input of the TIA is connected to the photodiode, and
the negative input to a capacitor matching the photodiode capacitance. Any
current injected by a voltage disturbance across the photodiode will thus cause
an identical injection from the capacitor. Both contributions will be subtracted
from each other by the TIA, yielding a zero net effect. In practice there was
a noticeable effect, caused by a higher-than-expected effective photodiode
capacitance due to the high parasitics of the wiring of the metal-on-glass
carrier (shown in section A.3.3).
Figure 2.21: Fully differential topology of the 0.18–µm CMOS receiver circuit. The
transimpedance amplifier is shown in more detail in figure 2.22. [Diagram labels:
photodiode bias, dc bias, dummy photodiode (only its capacitance), TIA, output +,
output -, further amplification.]
Figure 2.22: Differential transimpedance amplifier of the 0.18–µm CMOS receiver
circuit (design courtesy of Helix AG semiconductors). [Diagram labels: 1.8V supply,
bias, in+, in-, out-, out+, cascode.]
Chapter 3
Design automation issues
In this chapter we discuss basic electronic design automation (EDA) for OE-
VLSI. The discussion addresses methodological issues. Most of the efforts reported
in this chapter were carried out in 2002, several years before the time
of writing (2007). We have included references to recent scientific publications
where applicable.
The design of OE-VLSI systems consists of several steps, each needing some
form of EDA support. Some major steps related to the O-E aspect are the
following.
1. Design creation
This comprises the necessary EDA tool support at the IC and PCB
level for instantiating and designing different elements of the optical
interconnect: component libraries, interface definitions, and placement
and routing support.
2. Design simulation
After design creation and before production, the designer should be able
to simulate the optical interconnect within the context of a digital system
in order to verify the configuration of the optical interconnect and the
timing closure of the digital system, e.g., at a small set of target operating
points. This requires analog simulation models for the elements of the
optical link that can interface with one another in a circuit-level simulator.
Additionally, mixed-signal simulation of the interconnect with the digital
system should be possible.
3. Extraction of system-level properties
This point deals with the extraction of important system-level properties
of the optical interconnect for a certain interconnect configuration: over-
all feasibility, timing characteristics (delay, jitter, skew, rise/fall times),
reliability (e.g., bit error rate (BER)) and production and operating costs.
The extracted timing characteristics can be used in a digital timing
analyzer to verify timing closure for a certain digital design without
the need for a mixed-signal simulation. For the extraction of reliability
figures, modeling of different kinds of randomness and uncertainty in
various locations of the interconnect is required.
4. Design space exploration
Design space exploration support assists a system designer to attain the
optimal trade-off between system-level interconnect properties, after a
design-specific specification of boundary conditions and a global cost
function for these properties. This means that the EDA tool should
be able to assess design choices in order to comply with requested
interconnect properties for a given design. To this end, system-level
interconnect properties should be extracted over the design trajectory of
the parallel optical interconnect. An analysis of these data can then be
used to make this estimation.
The succession of the different steps above corresponds to the viewpoint of an
EDA tool developer. It is a bottom-up approach, where the implementation of
later steps depends on the earlier ones. A digital system designer will likely
adopt more of a top-down approach. Starting from the boundary conditions of
the system at hand, the design space will first be explored (step 4) in search
of an optimal trade-off of predicted system-level properties (step 3). Then the
design will be created (step 1) and simulated (step 2) in order to verify whether
the predicted system-level properties are actually achieved (again step 3).
3.1 Methodological aspects
We now consider a proposed methodology for implementing this support,
and refer to more detailed discussions of realizations made.
3.1.1 Design creation
At the IC level, pads for electrical interconnect are usually provided as library
items that hide the internal complexity of internal buffers and electrostatic
discharge (ESD) protection provisions from the designer. The IC designer
receives an opaque layout view of the pad, where the only marked parts
are the outline, the location of the physical pad, an internal virtual pin to
connect the signal to, and some power tracks. For the driver and receiver
circuits on which the optical arrays are flip-chipped, the same approach can
be followed. Support for interconnect testing after production (boundary scan
testing) could be implemented in a way compatible with standard JTAG (Joint
Test Action Group) provisions (see subsection 3.4).
At the PCB level, the physical design of the waveguide routing requires a
substantial extension of the functionality of traditional PCB design tools.
Different experimental technologies require different specialized software
support. Ribbon stack and flex approaches—which are build-up technologies
on the PCB—do not interfere with the PCB layout router. PCB placement
tools are only involved in the placement of the OE-VLSI module and possible
optical card edge connectors; only standard package geometry definitions are
involved. The exact location of the involved components can be easily exported
to a file to serve as a set of input boundary conditions for a specialized routing
tool.
In contrast with the build-up approach, a waveguide-embedded approach
requires tight integration with PCB placement and routing tools (e.g., to
avoid collisions with via holes, and to control geometrical aspects such as the
spacing and curvature of the waveguides). Such PCB-level EDA support is
not addressed here because of the high technological specificity of different
experimental embedded-waveguide approaches. An analysis of the techno-
logical characteristics of one approach and the associated implementation of
placement and routing tools would constitute a work by itself.
3.1.2 Pointwise design simulation
For electrical circuits, the established technique for pointwise simulation (i.e.
with deterministic rather than stochastic inputs) requires simulation models
that can be interfaced to other circuits and understood by a circuit simulator.
It is desirable to maintain this methodology for the (co)simulation of optical
interconnect. For the design of individual complex OE-VLSI components, it is
generally beneficial to model all physical aspects in detail, resulting in models
which involve very detailed simulators using partial differential equations
(PDEs). This kind of simulation is very accurate, but requires finite-element
spatial discretization, which is far more time-consuming than the simulation
of traditional electronic circuit components.
Hence, we have researched simulation models that trade some precision
for simulation speed, and that could be embedded into a traditional
circuit-level simulator. This involves finding a way to introduce optics into circuit
simulations, providing compatible interfaces between involved components
and dealing with IP-protected models for driver and receiver circuits. Section
3.3 provides a detailed treatment of this subject.
3.1.3 Modeling uncertainty
For the extraction of system-level properties related to timing and reliability,
pointwise simulations alone do not suffice. We have to take various sources
of uncertainty into account. These sources belong to three categories: man-
ufacturing process variations, operational deviations (e.g., optical coupling
misalignment) and random noise processes. For each contribution to uncer-
tainty, a probability distribution can be assumed. Estimating the effect of
different random contributions can be done in two ways:
• Using Monte Carlo simulations of the circuit-level models, when point-
wise simulations are fast enough
• By mathematical modeling of the distributions involved and their com-
position
In this work, a combination of both techniques is used, with an emphasis on
the second method. It arguably provides more insight into how the obtained
results come about. In doing this, we have moved away from circuit-level
models—and hence from general EDA tools—towards statistical descriptions
of relevant features. The work on uncertainty modeling is treated in the next
chapter.
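The two estimation routes can be contrasted on a deliberately simple case. The sketch below is an illustration, not the model used in this work: it assumes a single Gaussian noise contribution with standard deviation `sigma` and a decision margin `amplitude` (both hypothetical), and compares a Monte Carlo estimate of the error probability with the closed-form tail probability.

```python
import math
import random


def ber_analytic(amplitude, sigma):
    """Probability that Gaussian noise exceeds the decision margin: Q(amplitude/sigma)."""
    return 0.5 * math.erfc(amplitude / (sigma * math.sqrt(2.0)))


def ber_monte_carlo(amplitude, sigma, n_trials, seed=1):
    """Estimate the same probability by drawing independent noise samples."""
    rng = random.Random(seed)
    errors = sum(1 for _ in range(n_trials) if rng.gauss(0.0, sigma) > amplitude)
    return errors / n_trials


analytic = ber_analytic(2.0, 1.0)
estimate = ber_monte_carlo(2.0, 1.0, 200_000)
print(f"analytic: {analytic:.5f}, Monte Carlo: {estimate:.5f}")
```

The margin is kept small here so that errors occur often enough to be counted. At the very low error rates targeted in practice, the number of Monte Carlo trials needed to observe even a few errors grows in inverse proportion to the error rate, which is one reason to favor the mathematical route.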
3.1.4 Design space exploration
When an optical interconnect system is being conceived for a particular appli-
cation, a large number of design options are available for O-E devices, their
integration with CMOS and associated driving/receiving circuits, systems for
signal encoding and clock recovery, waveguides, connectors and packaging
methods yielding optically accessible chips. The choices to be made range
from decisions on overall architecture to the fine-tuning of parameters such as
the numerical aperture of a fiber.
Many choices have a profound impact on system-level properties of the
interconnect: feasibility, timing characteristics, reliability and production and
operating costs. Exploring the design space and making the right choices is
not a simple task, as multiple objectives are to be optimized simultaneously,
and the effect of individual choices on the combined system is not easily
quantified.
To alleviate the interconnect designer’s work, it would be useful to have a
systematic way of making these design choices, i.e., constructing a design
methodology for parallel guided-wave optical interconnects in an OE-VLSI sys-
tem. The eventual target is to formalize the result of this design methodology
development into a design tool. When the designer states some interconnect
requirements, this design environment should assist her in making decisions
on product and parameter choices. To achieve this goal, a methodology
development program comprising three stages has been envisaged (figure
3.1):
1. To begin with, the combined impact of individual link building block
variations on the properties of the complete interconnect system is
investigated. This comprises the development of tools to predict how
system-level link properties will change when certain parameters are
adapted.
2. In a second stage, the design space of optical links is searched for setups
that yield favorable system-level properties. The Pareto-optimal front
is constructed: this is the remaining set of setups after discarding those
that are uniformly worse than others with respect to the system-level
properties. Essentially, this front will result in a multi-objective solution
for optical links, in such a way that, for each solution, no system-level
property can be improved without making some other property worse.
3. In a third stage, using these front data, a design tool is developed that
helps a system designer attain the optimal trade-off between system-level
properties, after a design-specific specification of boundary conditions
and a global cost function for these properties. The tool searches for
the system-level properties corresponding to this optimal trade-off and
maps them onto a design setup expected to yield these properties. In
essence, this is the reverse mapping of what is done in stage 1.
Figure 3.1: Proposed design methodology development in three stages. [Diagram:
stage 1 predicts system-level link properties (power dissipation, link skew, link
latency, link reliability, implementation cost, …) from product and parameter
choices (VCSEL drive current, fiber numerical aperture, photodetector sensitivity,
interface technology, …); stage 2 constructs the multi-objective solution; stage 3
uses the Pareto-optimal front to target a design. Inset: Pareto curve of (e.g.)
reciprocal bandwidth versus (e.g.) energy per bit, separating infeasible designs
from sub-optimal (dominated) designs, which are discarded.]
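The construction of the Pareto-optimal front in stage 2 can be illustrated with a minimal sketch. It assumes that every system-level property is expressed as a cost (lower is better); the design points are hypothetical pairs of (energy per bit, reciprocal bandwidth) in arbitrary units, and the brute-force comparison is meant for small sets only.

```python
def dominates(a, b):
    """True if design a is at least as good as b in every property and
    strictly better in at least one (all properties are costs)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(designs):
    """Keep the designs that are not dominated by any other design."""
    return [d for d in designs if not any(dominates(other, d) for other in designs)]


# Hypothetical designs: (energy per bit, reciprocal bandwidth), arbitrary units
designs = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
print(pareto_front(designs))
```

In this example (3, 4) is discarded because (2, 3) is better in both properties, and (5, 5) because (1, 5) needs less energy at the same reciprocal bandwidth.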
This work was originally planned in 2002. Somewhat later, fellow researchers
from the Lyon Nanotechnology Institute (INL) realized appropriate EDA
support for the automated optimization of optical links. We refer to Mieyeville
et al. [2004] and Tissafi-Drissi et al. [2004] for a discussion of their modeling
and optimization approach. It focuses on the automated determination of
optimal transistor dimensions for OE-VLSI driver and receiver circuits so as
to optimize the link power and bandwidth according to a given trade-off.
Our attention then shifted to component simulation issues. In a recent project
on intra-chip interconnect [PICMOS, 2006], we have worked together with
INL on a framework for the systematic simulation-based predictive synthesis
of on-chip optical interconnect [O’Connor et al., 2007]. We have drawn on
the acquired modeling knowledge to develop and characterize a Verilog-
AMS simulation model for a µm-sized distributed Bragg reflector (DBR) laser
(research work of fellow Ph.D. student Joris Van Campenhout). Appendix B.2.2
shows the Verilog-AMS implementation of this model; for more information
we refer to O’Connor et al. [2006c].
The following sections give a more detailed representation of the actual
achievements and concepts.
3.2 Optical link building blocks at the CMOS level
The first occurrence of physical inter-chip interconnect in the design flow is
the instantiation of the external interface when designing an IC. For traditional
electrical interconnect this is the ring of bond pads. In an OE-VLSI system this
is the area on which electro-optical conversion components are to be flip-chip
mounted and where driver and receiver circuits are located (see figure 3.2).
The CMOS-related data of the area underneath the O-E arrays can be delivered
to an IC designer as a design kit library with cells having a symbolic view, a
layout view and a digital and analog simulation view. The difference with
traditional electrical perimeter I/O is the availability of I/O provisions and use
of flip-chip pads at the center of the IC (these pads are clearly visible on figure
2.17(b) on page 48). However, as surface-based electrical interconnect also
makes use of the very same ingredients, not much supplementary complexity
is involved for the introduction of OE-VLSI related CMOS cell libraries. The
electrostatic discharge (ESD) protection problem associated with surface-level
I/O pads is even reduced in the OE-VLSI case: in a packaged OE-VLSI module,
all pads are already connected to laser and photodiode arrays, whereas general
surface-level electrical I/O pads are still susceptible to ESD in an unmounted
package.
To protect the intellectual property (IP) of the interfacing circuits, the layout
view can be reduced—much in the same way as is done in standard cell
libraries—and the analog simulation model can be delivered as a precompiled
netlist or a simplified behavioral model. In the latter case, driver and receiver
circuits are broken down into parts, the behavior of which can be approxi-
mated by a simple description in a behavioral language like Verilog-AMS or
VHDL-AMS. In section 3.3.2, this modeling is discussed.
3.3 Design simulation
The most important circuit simulators are the industry-standard analog
simulator SPICE and the mixed-signal extensions to the two major hardware
description languages Verilog-AMS and VHDL-AMS. The latter two languages
are to be preferred over SPICE for optical link modeling as they provide
a much more readable model description, although the expressive power
of all three languages is quite similar (for analog simulations). The
analog and mixed-signal (AMS) languages allow for the direct expression of
differential equations, and natively support physical quantities other than
voltage, electrical current and time. A detailed overview of the differences
between Verilog-AMS and VHDL-AMS and of implementation intricacies (in this
case concentrating on the modeling of an airbag system) has been made by
Pêcheux et al. [2005].
Figure 3.2: DTA IC in 0.35–µm CMOS. The photodiode and VCSEL arrays are to be
flip-chip mounted onto the marked areas. External circuits and power connect to the
driver and receiver circuits through virtual ‘pins’ at the boundary of the areas.
[Layout labels: receiver array, driver array, top half I/O ports, bottom half I/O
ports, power, configuration.]
Figure 3.3: Sample transient simulation of an electrical-optical-electrical link. In this
figure, the VCSEL model is instantiated with parameters exaggerating the second-order
oscillatory response. The time-of-flight is set to zero. [Diagram: in, TX, VC, PD, RX,
out; traces: input, driver output, VCSEL output, photodiode output, receiver output.]
We have implemented circuit-level simulation models for the most important
link components in Verilog-AMS and integrated them into a simulation
framework (in practice, Cadence Virtuoso DFII). The different simulation
models can be chained together to simulate optical interconnect (figure 3.3
shows a simulation example). In the next section, some intricacies associated
with circuit-level modeling of optoelectronic systems are touched upon. Thereafter
our model implementations are discussed.
3.3.1 Introducing photonics in circuit simulators
3.3.1.1 Spatially independent models
The determining factor for the high simulation speed attained by electrical
circuit simulators is the limited modeling freedom. Here, models for dynamic
systems are described with systems of ordinary differential equations (ODEs)
containing only time derivatives. Explicit spatial dependencies are forbidden
as these require PDEs and time-consuming finite-element solvers. The result-
ing disadvantage is a reduced accuracy as spatially distributed quantities (e.g.,
photon or carrier distributions in a laser) are necessarily treated in a lumped
way. To mitigate this problem, such distributed quantities are often expressed
as linear combinations of fixed spatial functions, the coefficients of which are
then treated as basic time-dependent node values. Section 3.3.3.2 focuses on
this aspect.
3.3.1.2 How to dene light
The analog-enriched hardware description languages (HDL) VHDL-AMS and
Verilog-AMS explicitly support quantities other than voltages and currents.
Different components can be connected through interface ports involving
quantities of virtually any physical unit. There are only two ways to attribute
a quantity to an interface port: either as a potential at the port, or as a flow
through the port. This viewpoint is very suitable for a number of systems; for
instance, a thermal system can be well described with ports and nodes having
temperature as the potential and heat transfer (= power) as the flow.
For light, two choices have to be made: one about the dimension of light, and
one about its classification as a potential or a flow.
The ‘dimension’ of light The petahertz-range oscillatory nature of light
considered as a form of electromagnetic radiation is obviously much too
detailed to be efficient in a time-domain simulation. Furthermore, as discussed
above, spatial variability cannot be handled except by a breakdown into one
‘lumped’ node or a few nodes or interface ports (e.g., one port per mode or per
wavelength). For outbound or incoming light, a good choice of dimension
would be the instantaneous power of the beam (or part of it: mode, wavelength,
. . . ), which is a familiar quantity in the electrical domain as well.
A potential or a flow? At first sight, light should be naturally classified as a
flow (an intensive nature). There is a problem though: Kirchhoff’s Current
Law [1845]—enforced by circuit simulators on flow branches—does not apply
to optics. Light can be absorbed in media, and photons traveling in opposite
directions through a waveguide do not compensate for each other like opposite
currents in a wire. For that reason, it is easier to represent the optical power
as a potential value at an interface port (an extensive nature), and explicitly
provide for light propagation in simulation model descriptions. Also note
that bidirectionally used waveguides require separate interface ports for each
direction.
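The consequence of this choice can be sketched outside any simulator. In the following illustration (plain Python, not the Verilog-AMS models of this work) optical power is an ordinary value handed from one block to the next, propagation loss is written out explicitly, and a bidirectionally used waveguide exposes one method per direction. The launch power and the responsivity are hypothetical; the 7 dB loss is the optical path budget quoted in the receiver discussion.

```python
def attenuate(power_w, loss_db):
    """Explicit propagation: output power after a loss given in dB."""
    return power_w * 10.0 ** (-loss_db / 10.0)


def photocurrent(power_w, responsivity_a_per_w):
    """Photodiode: optical power (W) in, photocurrent (A) out."""
    return responsivity_a_per_w * power_w


class Waveguide:
    """Bidirectionally used waveguide: each direction has its own port pair,
    so powers traveling in opposite directions never cancel."""

    def __init__(self, loss_db):
        self.loss_db = loss_db

    def forward(self, power_in_w):
        return attenuate(power_in_w, self.loss_db)

    def backward(self, power_in_w):
        return attenuate(power_in_w, self.loss_db)


guide = Waveguide(loss_db=7.0)
received = guide.forward(1e-3)                 # hypothetical 1 mW launch power
current = photocurrent(received, 0.5)          # hypothetical 0.5 A/W responsivity
print(f"received power: {received * 1e6:.1f} uW, photocurrent: {current * 1e6:.1f} uA")
```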
3.3.2 IP protection of interfacing circuits
VCSEL driver circuits and photodetector receiver circuits are (normal) CMOS
designs, hence their internal operation can be natively simulated by electrical
simulation tools given accurate transistor models.
However, the detailed circuit diagrams are usually not available, as circuit
design corporations want to protect their intellectual property. As mentioned
before, one way to tackle this is to deliver the analog simulation model as a
precompiled netlist (with due encryption). Another solution is the provision
of a simplified behavioral model. This maintains the open accessibility of
simulation models, yet with the trade-off of a less accurate simulation than
can be provided by a transistor-level model.
Figure 3.4 shows a way to represent the driver and receiver circuit of the IO
project as a simplified behavioral model. This methodology is applicable to a
broad range of driver and receiver circuits. The simplified behavioral model
can be looked upon as a parameterized block diagram (such as figure 3.5).
In this block diagram, the different blocks represent circuit subparts that can
be described by equations, the complexity of which is coarse enough to not
reveal the exact circuit design, yet detailed enough for use in a link simula-
tion. In practice, the designer of these circuits in the IO project [Helix AG
semiconductors] represented the behavior of the building blocks as a circuit
of linear electrical components (resistors, capacitors, inductors), transmission
lines, amplifiers, controllable switches and current/voltage limiters. These
elements should suffice to represent the behavior of the great majority of
driver or receiver circuits.
The Verilog-AMS code for the behavioral driver/receiver circuits can be found
in appendix B.1.
Models and parameters in general The actual CMOS design does not re-
quire parameter extraction effort as the characterization of the CMOS prim-
itives being used is the responsibility of the CMOS foundry. Usually this
information is included in the design kit distribution for the technology. The
appearance and parameters of the simplified behavioral model are obtainable
through a manageable topological generalization of the design and by matching
simulations against the transistor-level description.
We should mention that the reverse mapping (simplified model to transistor-level
description) is not as easy, especially when the performance is pushed
to its limit. Hence, the simplification can be considered a real protection of
circuit IP. We refer to Tissafi-Drissi et al. [2004] for the discussion of a platform
addressing the automated dimensioning of interfacing circuits in a whole-link
approach. We furthermore refer to Sialm et al. [2006] for a driver transistor
sizing methodology.
Parameter   Description
Cintx       driver input capacitance
ts          time constant of current switch
IMOD        modulation current
IBIAS       bias current
Cd          driver output capacitance
qCd         driver output capacitance proportion to ground
Rd          driver output resistance
(a) driver model
Parameter   Description
Cinrx       receiver input capacitance
qCinrx      receiver input capacitance proportion to ground
Cpd         photodiode capacitance
A           preamplifier gain
fa          preamplifier bandwidth
A2          postamplifier gain
f2          postamplifier bandwidth
frollon     roll-on corner frequency
Q           quality factor
tdelay      intrinsic latency
(b) receiver model
Figure 3.4: Schematic representation and parameters of the actual behavioral driver and
receiver circuit model (courtesy of Helix AG semiconductors). The colored background
agrees with the building blocks of figure 3.5.
Figure 3.5: Building blocks of a photodiode receiver that can be easily modeled in an
analog HDL. [Diagram blocks: photocurrent input, transimpedance amplifier,
postamplifier, equalizer, limiting amplifier, decision circuit, digital output.]
Figure 3.6: Simple VCSEL model with lumped carrier and photon distributions. The
electrical parasitics circuit is indicated. [Diagram labels: substrate, p-type DBR,
n-type DBR, anode, cathode, emission, active region, oxide, region containing
S photons, region containing N electron-hole pairs, Rs, Ra, Ca.]
3.3.3 VCSEL model
3.3.3.1 Elementary lumped model
The simplest practicable simulation models for O-E devices are obtained by
treating all spatially distributed phenomena as if they exist uniformly inside a
volume. Such lumped models are necessarily less accurate, yet they can be
more easily characterized than sophisticated models.
Model equations From a lumped-modeling perspective, the essential quan-
tities inside a VCSEL are the number of electron-hole pairs N in the active
region and the number of photons S confined by the DBR mirrors and the
oxide aperture (this is indicated on figure 3.6). Given the reduced reflectivity
of the bottom mirror, the outbound optical power L is given by
$$L = \frac{\alpha_m \cdot v_g \cdot S \cdot h\nu}{l}, \tag{3.1}$$
(see, e.g., [Jungo, 2003]; we have adapted the notation for simplicity—
regrettably there is no established canonical notation). Here, αm is the
transmission coefficient of the mirror (dimensionless), vg the average photon
velocity (m/s), l the average length of a cavity roundtrip (m) and hν the energy
of one photon (J).
The basic electrical parasitics circuit is indicated on figure 3.6 as well, with
Ra and Ca the parasitics of the active region, and Rs the series resistance
established by both DBR mirrors and the metal-semiconductor contacts. The
dotted diode shape indicates the effectively injected current I in the active
region; the voltage across this ‘intrinsic’ diode can be well approximated
[Jungo, 2003] using an adaptation of the well-known Shockley [1950] relation:
$$V = \frac{nkT}{q} \ln\left(\frac{N}{N_e} + 1\right) \tag{3.2}$$
Here, kT/q is the thermal voltage, and n and Ne are fitting parameters.
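Equation 3.2 is straightforward to evaluate. The sketch below assumes room temperature; the values passed for the fitting parameters n (here `ideality`) and Ne (here `n_e`) are hypothetical and only serve to exercise the function.

```python
import math

K_BOLTZMANN = 1.380649e-23     # J/K
Q_ELECTRON = 1.602176634e-19   # C


def intrinsic_diode_voltage(n_carriers, ideality, n_e, temperature_k=300.0):
    """Equation 3.2: voltage across the intrinsic diode for N carriers."""
    thermal_voltage = K_BOLTZMANN * temperature_k / Q_ELECTRON
    return ideality * thermal_voltage * math.log(n_carriers / n_e + 1.0)


# Hypothetical fitting parameters
for n_carriers in (0.0, 1e3, 1e6):
    v = intrinsic_diode_voltage(n_carriers, ideality=2.0, n_e=1e-3)
    print(f"N = {n_carriers:.0e}: V = {v:.3f} V")
```

The "+ 1" inside the logarithm makes the voltage vanish for an empty active region (N = 0).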
The dynamic process between carriers and photons was first described by
Statz and deMars [1960], with several later improvements; see
Jungo [2003]; Zhang et al. [2004]; Yu [2003]; Li and Iga [2002] for an overview.
The quantities N (t) and S (t) are interpreted as reservoirs subject to several
mechanisms influencing their rate of change. A detailed treatment of the
contributions in these rate equations is outside the scope of this text. Suitably
approximated, the rate equations can be written as
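$$\frac{dN(t)}{dt} = \eta_i \frac{I(t)}{q} - \frac{N(t)}{\tau_N} - G\,\frac{N(t) - N_{tr}}{1 + e\,S(t)}\,S(t) \tag{3.3}$$
$$\frac{dS(t)}{dt} = \beta \frac{N(t)}{\tau_N} - \frac{S(t)}{\tau_S} + G\,\frac{N(t) - N_{tr}}{1 + e\,S(t)}\,S(t) \tag{3.4}$$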
In the right-hand side of both equations, the first term stands for the basic
augmenting mechanism: for N, carriers injected from the outside (with an
efficiency factor ηi; q is the electron charge) and for S, spontaneous emission
resulting from recombining carriers (with a probability factor β). The second
term reflects recombination and loss mechanisms for carriers and photons,
with respective effective time constants. The third term is the mechanism on
which the laser operation relies: stimulated emission—the recombination of
a hole and an electron when a photon passes, yielding another photon with
identical characteristics. The specific rate with which stimulated emission
occurs is proportional to the photon density and the logarithm of the carrier
density, and saturates at elevated photon densities. The expression used in
equations 3.3–3.4 is an approximation linearized for N (t), where Ntr is the
carrier density which must be surpassed to achieve lasing operation, G is a
proportionality factor and e establishes saturation of stimulated emission at
elevated photon densities.
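To get a feeling for equations 3.3–3.4, they can be integrated numerically. The following is a minimal sketch, not the Verilog-AMS model of this work: it uses a plain forward-Euler scheme, and every parameter value is a hypothetical choice made only to obtain a plausible laser with a threshold of roughly 0.6 mA (the saturation factor written e in the text is called `eps` in the code).

```python
Q_ELECTRON = 1.602176634e-19  # C

# Hypothetical parameter set, for illustration only
PARAMS = dict(
    eta_i=0.8,     # injection efficiency
    tau_n=2e-9,    # carrier lifetime (s)
    tau_s=2e-12,   # photon lifetime (s)
    gain=1e5,      # G (1/s)
    n_tr=1e6,      # N_tr
    eps=1e-6,      # saturation factor (written e in the text)
    beta=1e-4,     # spontaneous emission factor
)


def derivatives(n, s, current, p):
    """Right-hand sides of equations 3.3 and 3.4."""
    stimulated = p["gain"] * (n - p["n_tr"]) / (1.0 + p["eps"] * s) * s
    dn_dt = p["eta_i"] * current / Q_ELECTRON - n / p["tau_n"] - stimulated
    ds_dt = p["beta"] * n / p["tau_n"] - s / p["tau_s"] + stimulated
    return dn_dt, ds_dt


def simulate(current, t_end=5e-9, dt=1e-13, p=PARAMS):
    """Forward-Euler integration of a current step applied to an empty cavity.
    Returns the final (N, S)."""
    n, s = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        dn_dt, ds_dt = derivatives(n, s, current, p)
        n += dt * dn_dt
        s += dt * ds_dt
    return n, s


n_on, s_on = simulate(2e-3)      # above threshold
n_off, s_off = simulate(0.3e-3)  # below threshold
print(f"I = 2.0 mA: N = {n_on:.3e}, S = {s_on:.3e}")
print(f"I = 0.3 mA: N = {n_off:.3e}, S = {s_off:.3e}")
```

The time step is kept well below the photon lifetime so that the explicit scheme remains stable; a circuit simulator would use an adaptive implicit method instead.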
Steady-state behavior A steady-state solution for equations 3.3–3.4 can be
analytically obtained by setting both left-hand sides to zero and solving
for N and S. The result for S is required to derive L according to equation 3.1;
the result for N is not that important in this context. After the elimination of
N, a quadratic equation for S appears:
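$$\underbrace{\frac{G\tau_N + e}{\tau_S}}_{a} \cdot S^2 + \underbrace{\left[\frac{1}{\tau_S} + (1-\beta)\,G N_{tr} - \frac{\eta_i I}{q}\left(G\tau_N + \beta e\right)\right]}_{b} \cdot S \;\underbrace{-\,\frac{\eta_i I}{q}\,\beta}_{c} = 0 \tag{3.5}$$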
The only physically possible solution is the positive root:
$$S(I) = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \tag{3.6}$$
It turns out that generally 4ac is numerically much smaller than b². As long as
b is positive, S will be negligible. Only above a certain threshold current Ith—at
which b becomes zero—will S become sizeable. S will then be approximately
equal to −b/a and turn out to be proportional to I − Ith. This gives rise to the
following archetypal steady-state model:
$$L(I) \approx \begin{cases} \eta\,(I - I_{th}) & \text{if } I > I_{th} \\ 0 & \text{otherwise} \end{cases} \tag{3.7}$$
with these parameters:
$$I_{th} = \frac{q}{\eta_i} \cdot \frac{\frac{1}{\tau_S} + (1-\beta)\,G N_{tr}}{G\tau_N + \beta e} \tag{3.8}$$
$$\eta = \frac{\alpha_m \cdot v_g \cdot h\nu}{l} \cdot \frac{\eta_i}{q} \cdot \tau_S \cdot \frac{G\tau_N + \beta e}{G\tau_N + e} \tag{3.9}$$
Although Ith and η depend on a good number of material and construction
related parameters, their extraction from measurements is very simple: a trend
line is fitted to LI-measurements above threshold; the intercept and slope of
the trend line yield the sought parameters. Self-heating of the VCSEL disrupts
the linear relation at elevated drive currents (this is discussed below). This
simple model will be applied in section 4.2.2.1 (page 86); LI-curves are also
shown there.
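The extraction procedure can be mimicked on synthetic data. In the sketch below the LI-points are generated from the positive root of equation 3.5 instead of being measured, a least-squares trend line is fitted above threshold, and its zero crossing is compared with equation 3.8. All parameter values are hypothetical; `OUT_COUPLING` stands for the factor αm·vg·hν/l of equation 3.1 and `EPS` for the saturation factor written e in the text.

```python
import math

Q_ELECTRON = 1.602176634e-19  # C

# Hypothetical parameters, for illustration only
ETA_I, TAU_N, TAU_S = 0.8, 2e-9, 2e-12
GAIN, N_TR, EPS, BETA = 1e5, 1e6, 1e-6, 1e-4
OUT_COUPLING = 4e-8  # W per photon, assumed value of alpha_m * v_g * h*nu / l


def steady_state_photons(current):
    """Positive root (equation 3.6) of the quadratic of equation 3.5."""
    j = ETA_I * current / Q_ELECTRON
    a = (GAIN * TAU_N + EPS) / TAU_S
    b = 1.0 / TAU_S + (1.0 - BETA) * GAIN * N_TR - j * (GAIN * TAU_N + BETA * EPS)
    c = -BETA * j
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def threshold_current():
    """Equation 3.8."""
    numerator = 1.0 / TAU_S + (1.0 - BETA) * GAIN * N_TR
    return (Q_ELECTRON / ETA_I) * numerator / (GAIN * TAU_N + BETA * EPS)


def optical_power(current):
    """Equation 3.1 applied to the steady-state photon number."""
    return OUT_COUPLING * steady_state_photons(current)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


currents = [1e-3 + 0.25e-3 * k for k in range(9)]   # 1 mA to 3 mA, above threshold
slope, intercept = fit_line(currents, [optical_power(i) for i in currents])
i_th_fit = -intercept / slope                        # zero crossing of the trend line
print(f"I_th from equation 3.8: {threshold_current() * 1e3:.3f} mA")
print(f"I_th from trend line:   {i_th_fit * 1e3:.3f} mA")
print(f"slope of trend line:    {slope:.3f} W/A")
```

Self-heating is not part of this model, so the synthetic LI-curve stays linear at all drive currents, unlike a measured one.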
Small-signal behavior The small-signal behavior of a VCSEL describes how
small (in practice less than about 10%) dynamic deviations of the current
around an operating point I0 translate to changes of the optical emission. In
this case, deviations of all dynamic quantities can be expected to be small:
$$I(t) = I_0 + \Delta I(t) \tag{3.10}$$
$$N(t) = N_0 + \Delta N(t) \tag{3.11}$$
$$S(t) = S_0 + \Delta S(t) \tag{3.12}$$
In this case, equations 3.3–3.4 can be linearized around the operating point:
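$$\frac{d\Delta N(t)}{dt} \approx \frac{\eta_i\,\Delta I(t)}{q} + \left[-\frac{1}{\tau_N} - \frac{G S_0}{1 + e S_0}\right]\Delta N(t) - \frac{G\,(N_0 - N_{tr})}{(1 + e S_0)^2}\,\Delta S(t) \tag{3.13}$$
$$\frac{d\Delta S(t)}{dt} \approx \left[\frac{\beta}{\tau_N} + \frac{G S_0}{1 + e S_0}\right]\Delta N(t) + \left[-\frac{1}{\tau_S} + \frac{G\,(N_0 - N_{tr})}{(1 + e S_0)^2}\right]\Delta S(t) \tag{3.14}$$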
In order to eliminate ∆N (t) and d∆N (t) /dt from this linear differential
equation system, equation 3.14 needs to be differentiated once more; after
the elimination a second-order linear ordinary differential equation in ∆S (t)
arises (which we omit here owing to notational complexity). ∆S (t) will exhibit
a (typically underdamped) second-order response to ∆I (t). The intrinsic
modulation transfer function can be written as
$$H(\omega) = H_0\,\frac{\omega_R^2}{\omega_R^2 + j\omega\gamma - \omega^2}, \tag{3.15}$$
where H0 is a scale factor, ωR is the relaxation oscillation frequency and γ
the damping coefficient, all of them dependent on the operating point (closed
forms for these parameters are very lengthy but easily obtained with symbolic
mathematical manipulation software; we omit them here).
The complete small-signal VCSEL response can be regarded as a composition
of the parasitic elements and this intrinsic response.
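The shape of the intrinsic transfer function of equation 3.15 can be explored with a short Python sketch. It uses the fitted values quoted for the VCSELs at 5 mA (4.24 GHz and 12.97 GHz); how these map onto ωR and γ is an assumption made here for illustration: the first is read as the relaxation oscillation frequency ωR/2π in Hz, the second as a damping rate in s⁻¹. H0 is normalized to 1.

```python
import math

F_R = 4.24e9                 # assumed to be omega_R / (2*pi), in Hz
OMEGA_R = 2 * math.pi * F_R  # relaxation oscillation frequency (rad/s)
GAMMA = 12.97e9              # damping coefficient, assumed to be in 1/s


def h_intrinsic(f_hz, h0=1.0):
    """Equation 3.15 evaluated at modulation frequency f_hz (Hz)."""
    w = 2 * math.pi * f_hz
    return h0 * OMEGA_R ** 2 / complex(OMEGA_R ** 2 - w ** 2, w * GAMMA)


def magnitude_db(f_hz):
    """Magnitude of the normalized response in dB."""
    return 20 * math.log10(abs(h_intrinsic(f_hz)))


# Locate the resonance peak on a 0.01 GHz grid up to 10 GHz.
freqs = [k * 0.01e9 for k in range(1, 1001)]
f_peak = max(freqs, key=magnitude_db)

# Analytic peak of |H|: omega^2 = omega_R^2 - gamma^2 / 2 (if positive).
f_peak_analytic = math.sqrt(OMEGA_R ** 2 - GAMMA ** 2 / 2) / (2 * math.pi)
```

Under this reading of the units the response is underdamped, with a resonance peak of a few dB slightly below the relaxation oscillation frequency.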
For characterization purposes, a vector network analyzer can be connected
with one port to a current-biased test VCSEL and with the other port to a
reverse-biased fast photodiode optically coupled to the VCSEL. It performs a
frequency sweep of the response to a small sinusoidal superimposed VCSEL
current, measuring the magnitude and phase of the ac component of both
the optical output (S21 parameter) and the VCSEL voltage (S11 parameter).
The S11 response at different bias currents can be used (in combination with a
steady-state voltage sweep) to extract element values for the parasitics circuit
(as in figure 3.7(a)). The S21 response can be corrected for these parasitics,
yielding a measurement of only the intrinsic response, to which equation
3.15 can be fitted. Figure 3.7(b) shows a good fit of the intrinsic modulation
transfer function of the VCSELs from the IO project at 5 mA bias.
[Figure 3.7, panel (a), parasitics circuit: elements Rs = 15 Ω, Ra = 50 Ω and Ca = 0.45 pF between anode and cathode. Panel (b), intrinsic S21 response: S21 response (dB) versus frequency (GHz), from 0 to 10 GHz.]
Figure 3.7: (a) Simplified small-signal VCSEL parasitics circuit (12-µm oxide aperture)
at 5 mA extracted from S11 measurements (courtesy of Avalon Photonics). The
incremental resistance of the intrinsic diode is attributed to Ra here. (b) Measured
and fitted normalized intrinsic S21 response for a single VCSEL at 5 mA. The fitting
parameters are ωR = 4.24 GHz and γ = 12.97 GHz.
The Master’s Thesis of De Clerck [2004] builds upon this modeling approach
and addresses the lumped-model characterization of VCSELs (specifically, the
VCSELs of the 8× 8 arrays used here) in more detail. We additionally refer to
Sialm et al. [2005] for a comprehensive discussion on how to model a VCSEL
using only steady-state and modulation transfer function measurements.
3.3.3.2 Advanced modeling aspects
Temperature The sizeable power conversion and associated dissipation oc-
curring in VCSELs, combined with the high thermal resistance of the semi-
conductor materials used in their construction, gives rise to a significant
temperature increase in the cavity during continuous-wave operation. The
elevated temperature of high-performance CMOS obviously contributes to the
temperature increase as well.
The behavior of a VCSEL is dependent on this temperature inside the device:
the peak of the gain spectrum inside the quantum wells shifts to longer wave-
lengths with increasing temperature, and the emission wavelength increases
slightly as well [Nakwaski, 1996]. Furthermore, at elevated temperatures,
carriers can leak out of the active region [Scott et al., 1993].
Although very detailed simulation of the temperature distribution in a VCSEL
is feasible (see, e.g., [Sadi et al., 2006]), a lumped model is more suitable for a
behavioral simulation for reasons of efficiency. Figure 3.8 shows the archetypal
thermal model used for this purpose [Bewtra et al., 1995].
The impact of the temperature on the behavior is typically accounted for by
making the gain constant G and the transparency carrier count Ntr functions of
the temperature, and by including a leakage current accounting for thermally
induced carrier loss [Mena et al., 1999].
[Figure 3.8 schematic: equivalent network with elements Ptherm, Tinternal, Rtherm, Ctherm and Tambient.]
Figure 3.8: Simple thermal model adopted for the circuit-level modeling of a VCSEL,
displayed as an equivalent electrical network.
The temperature-dependent characterization of the IO demonstrator VCSELs
has been treated in [De Clerck, 2004]. Furthermore, in appendix B.2.2, a fully
characterized laser model is shown which takes thermally dependent gain
and carrier densities into account. For our full description of this model we
refer to O’Connor et al. [2006c].
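The lumped thermal model of figure 3.8 can be illustrated with a minimal Python sketch, assuming the usual first-order reading of such a network: Ctherm · dT/dt = Ptherm − (T − Tambient)/Rtherm. The element values are hypothetical and only serve to make the example runnable.

```python
import math

R_THERM = 2.0e3    # thermal resistance (K/W), hypothetical
C_THERM = 5.0e-10  # thermal capacitance (J/K), hypothetical
T_AMBIENT = 300.0  # ambient temperature (K), hypothetical


def simulate_temperature(p_therm, t_end, dt):
    """Forward-Euler integration of C dT/dt = P - (T - T_ambient) / R."""
    t_internal = T_AMBIENT
    for _ in range(round(t_end / dt)):
        heat_flow_out = (t_internal - T_AMBIENT) / R_THERM
        t_internal += dt * (p_therm - heat_flow_out) / C_THERM
    return t_internal


def analytic_temperature(p_therm, t):
    """Closed-form step response of the same RC network."""
    tau = R_THERM * C_THERM
    return T_AMBIENT + p_therm * R_THERM * (1 - math.exp(-t / tau))
```

For a dissipated power of 10 mW these values give a steady-state rise of Ptherm · Rtherm = 20 K with a time constant of 1 µs. In a behavioral model the resulting internal temperature would then feed the temperature-dependent gain and leakage terms.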
Multiple modes and spatially distributed carriers We have already men-
tioned above that the limited accuracy of a lumped model can be somewhat
improved by expressing distributed quantities as a linear combination of fixed
spatial functions, the coefficients of which are then treated as basic time-
dependent node values. Such an approach has been extensively researched
for VCSELs [Valle et al., 1995; Morikuni et al., 1999; Mena et al., 1999; Jungo,
2003; Zhang et al., 2004].
The method is as follows. Instead of lumped photon and carrier amounts, the
spatial distributions of the concentrations in the active region are considered.
In this context, ‘spatial’ is actually two-dimensional: as the active layer is very
thin, a lumped approach can still be applied in the longitudinal direction; it
suffices to model the concentrations in a transverse plane. The photon and
carrier concentrations are expressed as excitations of a limited set of basis
functions:
• With respect to the photon concentration, the different optical modes
of the VCSEL are estimated (e.g., linearly polarized modes or Laguerre-
Gaussian modes as suggested by Mena et al. [1999]) or calculated first
(e.g., using one of the approaches discussed by Bienstman et al. [2001]).
The normalized intensity profiles of the modes in the transverse plane
are then treated as basis functions. The number of modes required for an
accurate simulation depends on the VCSEL diameter and the maximal
operating current.
• With respect to the carrier concentration, orthogonal basis functions of
an inner product space over spatial distributions are considered. Here,
the inner product of two spatial distributions is the integration of their
pointwise product over the plane. Generally only radially symmetric
distributions are considered; a Bessel series expansion of the radial
profile is then an appropriate choice [Moriki et al., 1988].
When spatially dependent rate equations are considered, after some accept-
able simplifications—such as the linearization of the carrier density discussed
above—spatial and temporal integrations can be separated. When the series
expansions of carrier and photon concentrations are introduced, all spatial
integrations can be performed statically, yielding a system of ordinary dif-
ferential equations suitable for circuit-level simulation. The choice of mode
intensity profiles as basis functions for the photon concentration additionally
allows evaluating the coupling efficiency of the beam corresponding to each
mode with an optical fiber [Gholami et al., 2006a,b].
Appendix B.2.1 shows our Verilog-AMS implementation of the model dis-
cussed by Mena et al. [1999]. Although spatiotemporal VCSEL models can be
very accurate, detailed characterization is very involved. For instance, for the
lumped model the optical output can be measured using a photodiode; the
multimode model conversely requires the measurement of the distribution of
the optical output pattern over multiple modes; establishing the approximate
profiles of these modes is not trivial either, as it requires dedicated simulators
and/or specialty measurement techniques. Additionally, if the temperature
dependency is to be taken into account, the complexity increases severely.
Due to limitations of the measurement methods available to us, we did not
characterize this model for the VCSELs of the IO project. We refer to the concluding
observations of Jungo [2003] for suggestions on a line of attack.
3.3.3.3 Implementation issues
Limitations concerning the order of magnitude of values A problem for
VCSEL descriptions in AMS simulators concerns awkward orders of magni-
tude of dynamic quantities from an electronics viewpoint. In normal use, the
range of values covered by voltages (V) and currents (A) is roughly comprised
between 10⁻¹² and 10⁹. Smaller values are assumed to be zero by the simulator,
whereas larger values yield a ‘blowup’ error, to indicate that something is
wrong: either a connectivity mistake or a convergence problem.
Nevertheless, the photon and carrier densities in a laser, expressed in m⁻³,
can readily exceed 10²⁰. To avoid the above problems, the blowup limit and
the absolute tolerance can be specified when a new nature (dimension) is
declared. However, we noticed that in our Verilog-AMS implementation time
derivatives or integrals of such quantities are again assumed to be in the range
expected for voltage and current changes, without recourse (there is a way to
indicate the dimension and associated ranges of time derivatives and integrals,
yet this seems not widely respected).
[Figure 3.9 plot: optical output power (W) versus current (A), showing the correct steady-state solution and the incorrect negative solution; around the threshold, the simulator tends to “hop over”.]
Figure 3.9: Two solutions to the VCSEL rate equations in steady-state. The physically
impossible solution can confuse the simulator if appropriate measures are not taken.
The straightforward workaround, which we have applied, is the introduction
of scale factors until all quantities are in the accepted range; however this
measure harms the readability of the code.
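The effect of such scale factors can be illustrated with a small Python sketch. The value window is the one quoted above; the scale factors, the operating point and the carrier lifetime are hypothetical choices, and scaling the time axis as well is an assumption made for this illustration rather than a description of our Verilog-AMS code.

```python
ABS_TOL, BLOWUP = 1e-12, 1e9  # value window quoted in the text

N_SCALE = 1e24  # carrier density unit (m^-3), hypothetical choice
T_SCALE = 1e-9  # time unit (s), hypothetical choice


def in_window(x):
    """True if x is neither treated as zero nor flagged as a blowup."""
    return ABS_TOL <= abs(x) <= BLOWUP


def to_scaled(density, rate):
    """Convert a density (m^-3) and its time derivative (m^-3/s)
    to scaled units."""
    return density / N_SCALE, rate * T_SCALE / N_SCALE


carrier_density = 3e24                 # m^-3, hypothetical operating point
carrier_rate = carrier_density / 2e-9  # m^-3/s for a hypothetical 2 ns lifetime
n_scaled, rate_scaled = to_scaled(carrier_density, carrier_rate)
```

Both the raw density and its raw derivative fall far outside the window, whereas the scaled pair is of order one. The price is that every equation involving these quantities has to carry the same factors, which is the readability penalty mentioned above.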
Convergence issues The positive-only nature of carrier and photon densi-
ties/counts can give rise to simulation problems. For a steady-state simulation
or an initial operating point simulation, an initially attempted approach of the
simulator is assuming that all values of time derivatives and integrands of
time integrals are zero, yielding a regular system of equations which is solved
using the Newton-Raphson algorithm. While this method generally works, it
does not take the choice for the positive root in equation 3.6 into account and
can therefore converge to a physically impossible negative output power, as
indicated in figure 3.9. In a steady-state sweep where the previous solution is
used as a starting point for the next time step, this problem will occur around
the threshold current.
A possible solution [Mena et al., 1999] is to represent carrier and photon
densities as the square of another quantity. This can however lead to other
simulation convergence problems due to the ambiguity of the sign of the
quantity being squared.
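The root-selection problem and the squaring remedy can be reproduced on the quadratic of equation 3.5 with a few lines of Python. This is only a sketch: the coefficients are normalized, hypothetical values for an operating point above threshold (b < 0), and the plain Newton iteration below stands in for whatever solver an actual simulator uses.

```python
import math


def newton(f, df, x0, steps=60):
    """Plain Newton-Raphson iteration without damping or safeguards."""
    x = x0
    for _ in range(steps):
        x -= f(x) / df(x)
    return x


A, B, C = 1.0, -1.0, -0.01  # hypothetical, normalized coefficients


def f(s):
    return A * s * s + B * s + C


def df(s):
    return 2 * A * s + B


DISC = math.sqrt(B * B - 4 * A * C)
S_POS = (-B + DISC) / (2 * A)  # physical root (equation 3.6)
S_NEG = (-B - DISC) / (2 * A)  # physically impossible negative root

# Starting from the below-threshold solution S ~ 0 ends at the negative root.
s_from_zero = newton(f, df, 0.0)


# Remedy: solve for x with S = x**2, so that S cannot become negative.
def g(x):
    return f(x * x)


def dg(x):
    return df(x * x) * 2 * x


s_from_plus = newton(g, dg, 2.0) ** 2
s_from_minus = newton(g, dg, -2.0) ** 2
```

Starting from S = 0, the direct iteration settles on the negative root, as in figure 3.9. With the substitution, both starting signs lead to the same positive S, but x itself converges to either +√S or −√S, which is the sign ambiguity referred to above.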
The time step of the transient simulator is best limited (to about 20 ps) as
well to avoid a scenario where the VCSEL current abruptly crosses the threshold.
The natural continuation of the correct steady-state solution across the
threshold indeed always ends up with the incorrect solution (as seen in
figure 3.9); the limitation of the time step allows the derivative of the photon
density/count to change more gradually.
[Figure 3.10 schematic: photodiode circuit between anode and cathode with current source I = RL + Id, Cdep = 472 fF, Rs = 30 Ω and Ls = 22 pH.]
Figure 3.10: Simple photodiode circuit model (characterization courtesy of Albis
Optoelectronics). Cdep is the depletion capacitance, the other elements are contact and
wiring parasitics.
3.3.4 Other models
3.3.4.1 Photodiode
As discussed in section 2.3, a photodiode under reverse voltage bias converts
incident light (in a certain wavelength range) into a proportional electrical
current superposed onto the diode saturation current. Only this ‘dark current’
and the responsivity (slope) are relevant for the steady-state behavior. Both
are a function of the temperature, but self-heating effects are insignificant in
contrast with VCSELs. The values of responsivity and dark current and the
thermal dependencies are treated in our uniformity analysis (section 4.4.1).
The dynamic response is determined by the combination of the depletion
capacitance of the reverse-biased photodiode, some contact and wiring para-
sitics (figure 3.10), and the transit time of carriers depending on the location
of impact of photons in the intrinsic region. In the IO setup, the RC circuit
consisting of the depletion capacitance, the series resistance and the input
impedance of the CMOS receiver dominates carrier diffusion and the simple
model of figure 3.10 suffices. There are published circuit-level modeling solu-
tions in case the transit time is sizeable; for this we refer to Zhang and Conn
[1992] for a basic state-space model and to Jou et al. [2002]; Wang et al. [2003]
for a circuit-level implementation thereof.
3.3.4.2 Optical path
The circuit-level model for the optical path has to connect outgoing mode
excitations of the modeled VCSELs to the optical input terminal of the modeled
photodiodes. The propagation of each VCSEL mode profile through the
entire optical path can be calculated independently of all other VCSEL modes
(different modes are not mutually coherent). At the input port of each modeled
photodiode, the contributions of each propagated VCSEL mode can therefore
simply be added.
[Figure 3.11 schematic: an ‘optical path’ block connecting inputs in<1> to in<4> to outputs out<1> to out<4>.]
Figure 3.11: Example of a ‘circuit schematic’ for optical link simulation. The optical
path model here should connect linear combinations of its (delayed and possibly
dispersed) inputs to its outputs.
[Figure 3.12 schematics: (a) Schematic representation of the normal position of JTAG test circuits on a CMOS die (JTAG controller, pads, boundary-scan cells, internal circuits). (b) Driver and receiver arrays can be equipped with similar test circuits as well (TX and RX cells, each with a BSC).]
Figure 3.12: Boundary-scan cells (BSC) are normally put in between the electrical pad
ring and the internal circuits, daisy-chained and connected to a controller circuit. In
the same way, the boundary scan cells can be positioned in between internal logic and
VCSEL driver arrays or photodiode receiver arrays.
Propagating optical signals are subject to coupling losses and crosstalk, mate-
rial absorption, losses in tight fiber bends (when using POF), time-of-flight
delay and dispersion. Loss factors and crosstalk can be statically estimated;
for coupling losses we refer to section 4.3. When dispersion is negligible (as
when using graded-index fiber), a simple time delay suffices to model the
temporal behavior. When this is not the case (as when using step-index fiber or
integrated waveguides) advanced dispersion modeling is called for. Regarding
this subject we refer to Gerling [2005] who models dispersion as a filter in a
circuit simulator; the step response of the waveguide is hereby approximated
as a sum of a few exponentials, resulting in a compact representation in analog
HDL languages.
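A single channel of such an optical path model can be sketched in Python as follows. The sketch combines a static loss factor, a time-of-flight delay and a dispersion filter whose step response is a weighted sum of exponentials; each exponential term corresponds to a first-order low-pass state. The class name, the discretization and all numerical values are assumptions made for illustration and do not reproduce the implementation of Gerling [2005].

```python
from collections import deque


class OpticalPath:
    """Loss factor, time-of-flight delay and sum-of-exponentials dispersion."""

    def __init__(self, loss, delay_steps, weights, taus, dt):
        # The weights of the exponential terms must add up to one so that
        # the step response settles at the static loss factor.
        assert abs(sum(weights) - 1.0) < 1e-12
        self.loss, self.dt = loss, dt
        self.weights, self.taus = weights, taus
        self.states = [0.0] * len(taus)
        self.fifo = deque([0.0] * delay_steps)

    def step(self, power_in):
        """Advance one time step and return the output optical power."""
        self.fifo.append(power_in)
        delayed = self.fifo.popleft()
        for k, tau in enumerate(self.taus):
            # Forward-Euler update of one first-order low-pass state.
            self.states[k] += self.dt / tau * (delayed - self.states[k])
        return self.loss * sum(w * s for w, s in zip(self.weights, self.states))


# Hypothetical channel: 3 dB loss, 5 ps delay, two dispersion time constants.
path = OpticalPath(loss=0.5, delay_steps=5, weights=[0.7, 0.3],
                   taus=[20e-12, 100e-12], dt=1e-12)
outputs = [path.step(1.0) for _ in range(3000)]  # unit step at the input
```

Crosstalk between channels could be added on top of this by forming linear combinations of several such channel outputs, in the spirit of figure 3.11.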
3.4 Design for testability of optical links
In pad cells for electrical interconnect, special boundary scan cells are typically
used to enable testing of chip logic and PCB interconnect. Driven by a
standardized low-overhead protocol called the Joint Test Action Group (JTAG)
standard [IEEE Std 1149.1, 2001], boundary scan hardware provides a way
to observe the value driven to any JTAG-equipped output or present at any
equipped input, and it can be used to override the actual output state or the
internally observed input value (the idea is rendered in figure 3.12). This
system can also be used to observe/enforce the state of equipped internal
flip-flops or to provide configuration information.
In the IO demonstrator the driver and receiver circuits have been equipped
with boundary scan cells for configuration and testing purposes. Since an
automatic gain control in the receiver circuits requires the optical signal to be
dc-balanced, a test signal which alternates with the test clock is used during
interconnection testing (this is called AC-JTAG); this approach is also required
for electrical interconnect to test ac-coupled differential pairs. The AC-JTAG
used in the DTA IC is a customized extension of IEEE Std 1149.1 [2001]; in the
meantime, a similar approach has been standardized by IEEE Std 1149.6 [2003].
Using this approach, testing of optical interconnect can use the same hardware
and follow the same procedure as the testing of electrical interconnect.
The JTAG provisions in the DTA IC have been used extensively to verify the
CMOS integrity after flip-chip bumping; to verify the O-E device connections
after the completed flip-chip bonding; to test the operation of a parallel optical
link (figure 3.13); and for all kinds of configuration and diagnostic purposes.
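The principle of discovering the attained interconnectivity can be illustrated with a hypothetical walking-one test in Python: one transmitter at a time is activated through its boundary scan cell and the receiver cells are captured. The function fake_hardware and the wiring table are stand-ins invented for this example; they do not represent the DTA IC interface or the actual test software.

```python
def discover_connectivity(n_tx, drive_and_capture):
    """Walking-one test: activate one transmitter at a time and record
    which receivers observe the test signal."""
    table = {}
    for tx in range(n_tx):
        pattern = [i == tx for i in range(n_tx)]
        captured = drive_and_capture(pattern)
        table[tx] = [rx for rx, seen in enumerate(captured) if seen]
    return table


# Hypothetical stand-in for the hardware: a 4-channel cable in which
# channels 2 and 3 are swapped and transmitter 1 has no fiber mounted.
WIRING = {0: 0, 2: 3, 3: 2}


def fake_hardware(pattern):
    """Return the receiver states captured for a transmitter pattern."""
    captured = [False] * 4
    for tx, active in enumerate(pattern):
        if active and tx in WIRING:
            captured[WIRING[tx]] = True
    return captured


connectivity = discover_connectivity(4, fake_hardware)
```

A transmitter with an empty list is unconnected, and a transmitter reaching more than one receiver would point to crosstalk.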
Figure 3.13: Illustrative example of what is possible with JTAG support. Here, an
experimental optical cable using 8× 8 connectors with 4× 8 mounted fibers has been
plugged in between two DTA ICs. The JTAG provisions are used to discover the
attained interconnectivity (top screenshot) and to discover potential crosstalk issues
(off-grid positions in the bottom screenshot). The interconnection test takes less than a
second; the crosstalk test takes about a minute.
3.5 Summary
In this chapter, we have discussed a number of methodological issues con-
cerning EDA support for OE-VLSI systems. The design of OE-VLSI systems
consists of several steps, each needing some form of EDA support. We have
presented a breakdown of fundamental design automation support—design
creation, simulation, extraction of properties and design space exploration—
the methodologies involved and references to relevant realizations.
Circuit-level simulation models for the most important link components have
been implemented in the mixed-signal language Verilog-AMS and integrated
into a simulation framework (in practice, Cadence Virtuoso DFII).
The different simulation models can be chained together to simulate optical
interconnect. We have also discussed the intricacies of integrating
multidisciplinary (electrical, optical, thermal) and sometimes badly conditioned
differential equations (e.g., laser rate equations) into a simulation system that
bears the marks of its dedication to electrical circuits.
The circuit-level modeling work performed lies at the basis of two Master’s
theses [De Clerck, 2004; Bhatti, 2006] on the characterization and tuning of OE-
VLSI components. Furthermore, in the context of the PICMOS [2006] project
on intra-chip interconnect, a DBR microlaser simulation model has been
implemented in Verilog-AMS [O’Connor et al., 2006c]. This was a joint effort
with the Heterogeneous Microelectronic Design group of the Lyon Nanotechnology
Institute on the systematic simulation-based predictive synthesis of integrated
optical interconnect [O’Connor et al., 2007].
Chapter 4
Statistical OE-VLSI modeling and characterization
4.1 Introduction
In this chapter, we thoroughly examine the connection between the many
efficiency and accuracy aspects of a sophisticated OE-VLSI assembly and
resulting across-the-board optical interconnect characteristics such as overall
signal attenuation and offset, latency, jitter, noise and ensuing bit error ratio
(BER) figures.
4.1.1 Statistical modeling and characterization approach
A modeling and characterization approach of a necessarily strong statistical
nature is adopted. The reason for this is as follows. The composition of
many different components constituting an OE-VLSI system results in all-
embracing stochastic variables (such as a latency figure) being determined
by a sizeable number of independent uncertainties: various efficiencies and
accuracies of CMOS circuits, O-E conversion devices, optical and mechanical
components. As the odds of a collective extreme statistical outcome exponen-
tially decrease with the number of independent contributions, a conventional
worst/normal/best case examination of all contributions would yield an
unduly broad-ranging aggregate result in this case.
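The argument can be made concrete with a small Python calculation. As an illustrative assumption (not a model used in this work), each of n independent contributions is taken to be uniformly distributed over (−δ, δ); the worst-case span of the sum is then compared with its 3-sigma span, and the probability that all contributions simultaneously fall in their extreme 5% tail is computed.

```python
import math


def worst_case_span(n, delta):
    """Bound obtained by stacking all n worst cases."""
    return n * delta


def three_sigma_span(n, delta):
    """3-sigma bound of the sum of n independent uniform(-delta, delta)
    terms; each term has a variance of delta**2 / 3."""
    return 3 * delta * math.sqrt(n / 3)


def prob_all_extreme(n, tail=0.05):
    """Probability that all n contributions fall in their extreme tail."""
    return tail ** n
```

For n = 12 the worst-case span is 12δ whereas the 3-sigma span is only 6δ, and the probability of three contributions being simultaneously in their 5% tail is already as low as 1.25 · 10⁻⁴.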
We present simple yet adequate stochastic models to capture the behavior
and uniformity of the different subsystems of the optical interconnect—driver
circuit, VCSEL, optical path, photodiode and receiver circuit. These models
specifically address amplitude, timing and noise behavior. The methodology
is generally applicable. For the purpose of quantitative analyses our models
are characterized based on detailed and extensive measurements of the actual
performance on OE-VLSI hardware from the IO project (the converter boards
presented in appendix A). In order to capture as much information as possible,
this measurement approach is one of dividing the parallel optical link in as
many parts as possible—by cutting in between driver and laser arrays, in front
of and behind the fiber assembly, and in between photodiode and receiver
arrays.
4.1.2 Optical path characterization
A considerable effort is required to characterize the optical path in a statis-
tically relevant way, when having only a very limited number of properly
connectorized multi-fiber cables at one’s disposal. To this end, we have
built a measurement setup to accurately quantify VCSEL-to-fiber (and fiber-to-
photodiode) coupling efficiencies. An automatic 3-axis fiber-device positioning
setup has been realized with manual control of the remaining 3 (rotational)
degrees of freedom. With the appropriate dedicated programming, this system
is autonomously able to accurately scan the alignment-dependent coupled
power between all devices of an 8×8 VCSEL (or photodiode) array and a fiber.
Using conventional graded-index glass fiber, the dependence of the coupling
efficiency on the alignment in the IO demonstrator turns out to be well
apparent at the VCSEL side but much less so at the photodiode side. This is
understandable, as an 80-µm photodiode size was initially chosen to retain
support for fibers with a high numerical aperture and wider-emitting step-index
fibers. Our characterization focus is therefore closer to the VCSEL side,
also because of the relatively larger contribution of VCSEL process variations
to overall nonuniformity.
The acquired measurements are used to characterize a simple ray tracing
model for the expected coupling efficiency on an 8×8 VCSEL array. A stochas-
tic model (taking process variations into account) for the alignment-conditional
power coupling between a VCSEL and a fiber is characterized as well.
When a fiber bundle is connectorized, the positions of the fiber facets are
mutually fixed relative to each other as well as relative to the alignment
features of the connector. When the connector is plugged into a package,
the misalignment of laser-fiber pairs at different positions in the array is
therefore strongly correlated. Translational mismatches cause only uniform
misalignments of laser-fiber pairs as long as the orientation of the fiber facet
array w.r.t. the optical array is constant. However, a rotation of the connector
in the plane of the optical array can introduce severely nonuniform radial
misalignment, and connector tilt causes the fiber facets to be positioned at
different working distances.
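The difference between translational and rotational mismatches can be checked with a short geometric sketch in Python. The array pitch and the choice of the array center as rotation center are assumptions made for illustration only.

```python
import math

PITCH_UM = 250.0  # array pitch in micrometers, hypothetical


def radial_misalignments(dx_um, dy_um, theta_rad, n=8, pitch=PITCH_UM):
    """Radial laser-fiber offset (micrometers) at each position of an
    n x n array when the connector is shifted by (dx, dy) and rotated by
    theta about the array center."""
    half = (n - 1) / 2
    result = []
    for row in range(n):
        for col in range(n):
            x, y = (col - half) * pitch, (row - half) * pitch
            xr = x * math.cos(theta_rad) - y * math.sin(theta_rad) + dx_um
            yr = x * math.sin(theta_rad) + y * math.cos(theta_rad) + dy_um
            result.append(math.hypot(xr - x, yr - y))
    return result


shifted = radial_misalignments(3.0, 4.0, 0.0)  # pure translation
rotated = radial_misalignments(0.0, 0.0, 0.01)  # pure rotation (0.01 rad)
```

The pure translation yields the same offset of 5 µm everywhere, whereas the rotation yields an offset of 2r·sin(θ/2) that grows with the distance r from the rotation center: about 12 µm at the corners against less than 2 µm near the center for this pitch.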
The modeling of alignment-inclusive array-wide transmitter-side optical power
coupling is addressed. One ingredient towards this modeling is the alignment-
conditional measurement of the coupled power mentioned above, which
directly gives a picture of the variability of the amount of optical power and
the directionality of the emission across different VCSELs.
The other ingredient bears upon the uncertainty of the fiber bundle alignment.
Here we have built on the research of others. Approaches in the IO project
for the packaging of an optically equipped IC—in a way that a multi-fiber
connector can be plugged in—constituted a research subject of IO project
partner CEA-LETI [Marion et al., 2004] and the main work of fellow PhD
student Olivier Rits [Rits et al., 2004]. One of his research achievements is a
stochastic model for the misalignments during the packaging and the insertion
of a multi-fiber connector. We have used this process model to generate a
set of representative fiber bundle alignments and combine them with our
alignment-conditional characterization. A stochastic model for the array-wide
laser-fiber coupling has been developed and characterized as a result.
4.1.3 Global interconnect parallelism evaluation
By bringing all characterized component models together, we can achieve the
quantification of all-inclusive signal attenuation and offset, latency deviation,
jitter, noise and ensuing bit error ratio (BER) figures, not only including
the statistical evaluation of isolated optical channels, but also capturing the
statistical dependencies between different channels juxtaposed within the
same array or package. The results obtained can be directly put into practice to
address the following subject.
As we have discussed in section 1.4, the main motivation for short-range optical
interconnect is the mitigation of bandwidth density issues. Two kinds of
bandwidth density issues can be distinguished: one of raw throughput and
another where the link bandwidth determines the latency. The throughput
bottleneck occurs for instance when telecommunication links are brought
together for packet-switching or circuit-switching purposes. The latency
bottleneck is commonly occurring inside computational systems (see equation
1.1 on page 12 and the surrounding text). Depending on this distinction, the
relationship between the individual channels in a parallel interconnection is
dissimilar as well.
In long-haul telecommunications, signaling is inherently bit-serial. The dif-
ferent channels of multi-channel links typically represent independent data
streams, brought together for packet-switching or circuit-switching purposes.
The plesiochronous relation or randomly phase-shifted synchronous relation
between the streams necessitates separate clock recovery or resynchronization
in each channel.
In the other (computational-system) situation, it is often interesting to stripe
data packets over different channels, i.e., successive bits or bytes of a data
packet are allotted to different physical channels and simultaneously transmit-
ted. This assures a very low latency for the initial part of a data packet, which
could already suffice to start processing while the rest of the packet comes
in. The different channels of an interconnection will now carry bit-parallel
clock-synchronous data streams. They have to be treated jointly, and the
timing skew between them has to be kept at an absolute minimum if an
unsophisticated reassembly of the data packet is desired. The uniformity of
the individual channels is key to the viability of different options.
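Byte-wise striping and the corresponding unsophisticated reassembly can be sketched in a few lines of Python. The function names are illustrative, and the reassembly assumes that the lanes arrive without mutual skew, which is exactly the uniformity requirement at stake.

```python
def stripe(packet: bytes, n_channels: int):
    """Allot successive bytes of a packet to successive channels."""
    return [packet[ch::n_channels] for ch in range(n_channels)]


def reassemble(lanes):
    """Inverse of stripe(); assumes the lanes arrive without mutual skew."""
    out = bytearray()
    for i in range(max(len(lane) for lane in lanes)):
        for lane in lanes:
            if i < len(lane):
                out.append(lane[i])
    return bytes(out)


lanes = stripe(b"0123456789", 4)
```

With four channels, the first four bytes of the packet are available after a single byte time on each lane, which illustrates the low latency for the initial part of a packet.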
In current, commercially available, short and medium range parallel optical
interconnect systems parallelism is modest, and the uniformity requirements
of the optical channels are still very relaxed. Timing recovery is performed
independently for each channel, tolerating very high skew. This is the case
even in a parallel optical system such as 12x InfiniBand [2000] using SNAP12
[2002] optical modules, where one data stream is striped over different parallel
channels to minimize packet latency. Furthermore, in virtually all current
optical receiver circuits, dc-balanced signal coding is used. The logic threshold
is independently set for each channel, tolerating widely different dynamic
signal ranges.
This kind of tolerance towards nonuniformity comes at a cost: clock recov-
ery circuits require significant chip real estate and power, and dc-balancing
schemes such as 8-b/10-b coding [Widmer and Franaszek, 1983] reduce band-
width and thus introduce extra latency. For OE-VLSI approaches these area
requirements become prohibitive due to the associated two-dimensional real
estate limitations [Venditti and Plant, 2003]. Large-scale parallelism in opti-
cal interconnects at short distances can only be viable when the individual
channels are of minimal complexity.
We therefore address the question as to whether channels of short, highly
parallel optical links can be made sufficiently uniform with respect to timing
and signal amplitude so that a parallel synchronous link can be made without
expensive per-channel adaptation circuitry. Significant gains can be realized if
this is indeed the case.
As for timing, if the inter-channel timing skew is low enough, a simple source-
synchronous signaling scheme could be deployed. In such case, one channel
is sacrificed to carry a source-synchronous clock, from which the sampling
instant for all other channels can be derived. This obviates the need for many
expensive independent clock resynchronization circuits. The feasibility of
source-synchronous optical signaling has already been demonstrated on a
minimal scale—over two optical channels [Gui et al., 2005].
As for signal amplitude, if channels are sufficiently uniform, one dc-balanced
signal—the same clock-dedicated channel—could be used to extract the logic
threshold for all other channels, eliminating the dc-balancing requirement
on these channels. Another approach to achieve this objective is to work
differentially [Venditti et al., 2004], but this has the disadvantage of halving
the achievable bandwidth.
If both uniformity objectives were reached, emerging low-latency high-
bandwidth I/O standards such as HyperTransport, Parallel RapidIO and
POS-PHY Level 4 could be directly applied over the parallel interconnect with
source-synchronous DDR schemes. We have to remark that the latest specifi-
cations of most such standards provide for per-channel phase adjustment and
often dc-balanced coding as well. This per-channel adjustment is motivated
by inter-channel skew problems, to which electrical interconnect is not
immune either at all but the shortest link lengths. The dc-balanced coding permits
the use of ac-coupling approaches. However, the associated complexity and
area penalty remain.
4.1.4 Structure of this chapter
This chapter follows the natural flow of data through the interconnect, with
successive sections on the transmitter side, the optical path and the receiver
side. Building on the developed models, the question on the feasibility of
dc-unbalanced and true source-synchronous signaling is addressed in the final
section.
4.1.5 A note on notation
In this chapter, we consistently adopt the notation style used in probability
theory where random variables are written in upper case and variable values
as well as parameters are written in lower case. This may sometimes look
awkward (e.g., a capital H instead of the lower case η to indicate a
quantum efficiency), yet it has the advantage of indicating the probability-related
status of the name without additional notational burden.
4.2 From chip to light
4.2.1 Drive current uniformity
The basic functionality of a VCSEL driving circuit is switching the laser current
between two levels i0 and i1 as dictated by a digital input. The on-current i1
determines the maximal optical output power. The off-current i0 should be
low for a good contrast, but still well above the VCSEL threshold current ith
to avoid turn-on delay and a slow dynamic laser response. The topology used
in the DTA IC is shown again in figure 4.1(a). The high-side p-FET T1 drives
the constant current i1 into the VCSEL and a shunt path, which is switchable
through n-FET T2, diverting the modulation current i1 − i0 = imod away from
the VCSEL using constant-current n-FET T3. When the shunt path is active,
the resulting VCSEL current thus becomes i1 − (i1 − i0) = i0, otherwise it
remains i1.
When considering arrays of driver circuits, process variations and device
mismatch may cause nonuniformity between drive currents. Let
$I_1(i_{1,\mathrm{aim}})$ and $I_{\mathrm{mod}}(i_{\mathrm{mod,aim}})$
represent random variables over different driver circuits, yielding the actual
drive current given the targeted current.

Figure 4.1: (a) VCSEL driver circuit. (b) Current-based reference current distribution.
(c) Bias-voltage-based reference current distribution.

Figure 4.2: Distribution of Θ (in mA^0.5) and K, as fitted to supply current measurements
over one array.
Bias voltages for all constant-current FETs need to be distributed over the array.
Figure 4.1(b) shows a current-based approach, where each driver receives its
own reference currents, which are locally scaled to $I_1(i_{1,\mathrm{aim}})$ and
$I_{\mathrm{mod}}(i_{\mathrm{mod,aim}})$. This method is very accurate, but less
practical for very large arrays due to the point-to-point distribution. Bias
voltages $v_1(i_{1,\mathrm{aim}})$ and $v_{\mathrm{mod}}(i_{\mathrm{mod,aim}})$ are
globally distributed instead, as shown in figure 4.1(c) (we ignore mismatch
of the shared mirror-source MOSFET as this can be globally compensated
for). Due to MOSFET threshold and transconductance variations over larger
distances, the actual current will deviate from the requested current in both
approaches. Using the basic quadratic MOSFET model, the resulting current is
approximately
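$$
I_1(i_{1,\mathrm{aim}}) \approx
\begin{cases}
K \left( \Theta + \sqrt{i_{1,\mathrm{aim}}} \right)^2 & \text{if } \Theta > 0 \text{ or } i_{1,\mathrm{aim}} > \Theta^2 \\
0 & \text{otherwise.}
\end{cases}
\tag{4.1}
$$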
The same model applies to $I_{\mathrm{mod}}(i_{\mathrm{mod,aim}})$. The random
variables Θ and K are proportional to the respective mismatch of threshold and
transconductance between both sides of the current mirror. We have measured
the actual individual driver currents $i_{1,d}$ by selectively enabling the
drivers and measuring the change in supply current, and extracted $\theta_d$
and $k_d$ each time by fitting equation 4.1 to the measurement. Figure 4.2
shows the resulting sample of the joint distribution of Θ and K. The slightly
negative sample correlation coefficient of −0.23 is too close to zero to reject
zero correlation between Θ and K at a 95% confidence level; the correlation, if
any, should be very small. A good approximation for the squared coefficient of
variation (the square of
$\mathrm{CV}_{I_1}(i_{1,\mathrm{aim}}) = \sigma_{I_1}(i_{1,\mathrm{aim}}) / \mu_{I_1}(i_{1,\mathrm{aim}})$),
in the region where $i_{1,\mathrm{aim}} \gg \sigma_\Theta^2 \approx 43\,\mathrm{nA}$,
can be found by working out the variance of $\ln(I_1(i_{1,\mathrm{aim}}))$:
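$$
\left( \mathrm{CV}_{I_1}(i_{1,\mathrm{aim}}) \right)^2
= \frac{\sigma^2_{I_1}(i_{1,\mathrm{aim}})}{\mu^2_{I_1}(i_{1,\mathrm{aim}})}
\approx \frac{\sigma_K^2}{\mu_K^2}
+ 4\,\frac{\sigma_\Theta^2}{\left( \mu_\Theta + \sqrt{i_{1,\mathrm{aim}}} \right)^2}
\tag{4.2}
$$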
As µK ≈ 1 and µΘ ≈ 0, we can assume µI1 ≈ i1,aim and equation 4.2 reduces to
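$$
\left( \mathrm{CV}_{I_1}(i_{1,\mathrm{aim}}) \right)^2
\approx \frac{\sigma^2_{I_1}(i_{1,\mathrm{aim}})}{i_{1,\mathrm{aim}}^2}
\approx \sigma_K^2 + 4\,\frac{\sigma_\Theta^2}{i_{1,\mathrm{aim}}}
\tag{4.3}
$$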
Figure 4.3 shows the measured coefficient of variation and the match according
to equation 4.3. Our model slightly overestimates the measured CV.
The effect of MOSFET threshold variations is prominent, but only at currents
below the VCSEL threshold (around 1 mA). Beyond this value, the relative
deviation quickly decreases below 1%. The on-current $I_1(i_{1,\mathrm{aim}})$
will therefore be very uniform for any relevant configuration. Small settings
for $i_{\mathrm{mod,aim}}$, however, will result in a nonuniform modulation
current, which could be improved if the more accurate current distribution
method (figure 4.1(b)) is used.

Figure 4.3: Coefficient of variation (in %) of VCSEL drive current I1 (i1,aim) over the
array versus VCSEL current i1,aim (mA), measurement and model.

Figure 4.4: Measured light-current (LI) curves on a fully processed VCSEL array:
optical output power L (mW) versus VCSEL current i (mA).
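As an illustration of equations 4.1 and 4.3, the following minimal Python sketch evaluates the drive current of a single mismatched driver and the array-wide coefficient of variation. The value of σΘ² (43 nA, expressed in mA) is taken from the text; the value used for σK is only an assumed placeholder, as are all function and variable names.

```python
import math

SIGMA_THETA_SQ_MA = 43e-6  # sigma_Theta^2 of about 43 nA, expressed in mA (from the text)
SIGMA_K = 0.005            # assumed placeholder; not a value reported in the text


def drive_current(i_aim_ma, k, theta):
    """Equation 4.1: actual current (mA) of one driver with mismatch (k, theta)."""
    if theta > 0 or i_aim_ma > theta ** 2:
        return k * (theta + math.sqrt(i_aim_ma)) ** 2
    return 0.0


def cv_drive_current(i_aim_ma, sigma_k=SIGMA_K, sigma_theta_sq=SIGMA_THETA_SQ_MA):
    """Equation 4.3: coefficient of variation, assuming mu_K ~ 1 and mu_Theta ~ 0."""
    return math.sqrt(sigma_k ** 2 + 4.0 * sigma_theta_sq / i_aim_ma)


if __name__ == "__main__":
    for i_aim in (0.1, 1.0, 5.0):
        print(f"i_aim = {i_aim:4.1f} mA -> CV = {100 * cv_drive_current(i_aim):.2f} %")
```

The threshold-related term scales with 1/i, which is why the nonuniformity only matters for small targeted currents such as a small modulation current.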
4.2.2 VCSEL array uniformity
4.2.2.1 Static uniformity
Typical threshold currents of the IO project VCSELs are in the order of 0.9 mA,
the voltage drop at 5 mA is 1.8 V, and the wall-plug efficiency is 29.8%. Figure
4.4 shows measured light-current (LI) curves.

Figure 4.5: Variation of threshold current Ith (mA) and slope efficiency H (W/A) within
VCSEL groups with different oxide apertures (8 µm, 10 µm and 12 µm; measurement
courtesy of Avalon Photonics). The dashed lines are 2σ bounds of jointly normal
distributions with their parameters extracted from the measurements.

A good fit of each measured LI curve can be obtained with a simple linear
stochastic model (already deterministically introduced in section 3.3.3):
$$
L(i) =
\begin{cases}
H \left( i - I_{\mathrm{th}} \right) & \text{if } i > I_{\mathrm{th}} \\
0 & \text{otherwise,}
\end{cases}
\tag{4.4}
$$
where the threshold current Ith and slope efficiency H (capital η) are random
variables over different lasers. The VCSEL performance depends strongly
on the diameter of the active region. A small diameter typically results in a
small threshold current, but the voltage drop and series resistance are larger,
worsening the high-speed properties. A large diameter results in a large
threshold current, which implies a larger power dissipation. Samples of Ith
and H for VCSELs with different oxide apertures are shown in figure 4.5; a
clear positive correlation is observed. VCSELs with a 12-µm aperture diameter
have been used for the flip-chipped arrays on the DTA samples.
Figure 4.6 shows the measured coefficient of variation of VCSEL output power
over the array versus the applied current. The data shown come from the
same three virgin 8×8 arrays of figure 4.5, and three 8×8 samples after
hybridization on CMOS, substrate thinning, application of an antireflective
coating and a PCB mounting step at 230 °C. The uniformity of the wafer-probed
samples is excellent: the CV is around 1–2% when the current is
sufficiently above threshold. Measurements on 16×16 arrays indicate similar
quality. The fully mounted DTA IC samples, however, exhibit a minimal CV
around 4% at 3 mA that worsens to 6–8% with increasing current. Here, a
much broader range for Ith and H could be observed. The source of this
spread remains unknown; tests have shown that neither the hybridization nor
the elevated temperature for PCB mounting is directly responsible.
Figure 4.6: Coefficient of variation (in %) of VCSEL output power L over the array
versus VCSEL current i (mA), for on-wafer probed arrays with 8 µm, 10 µm and 12 µm
oxide openings and for fully mounted CMOS samples 1 to 3. (a) Measurement.
(b) Calculated model according to equation 4.5.
The variation of the VCSEL response within an array can be adequately
modeled if we consider the joint distributions of Ith and H with moments µth,
σth, µH and σH , and ρ the correlation between Ith and H. Again by calculating
the variance of ln (L (i)), to a good approximation, the CV can be found to be
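$$
\left( \mathrm{CV}_L(i) \right)^2
= \frac{\sigma_L^2(i)}{\mu_L^2(i)}
\approx \frac{\sigma_H^2}{\mu_H^2}
+ \frac{\sigma_{\mathrm{th}}^2}{\left( i - \mu_{\mathrm{th}} \right)^2}
- 2\rho\,\frac{\sigma_H \sigma_{\mathrm{th}}}{\mu_H \left( i - \mu_{\mathrm{th}} \right)}
\tag{4.5}
$$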
for currents sufficiently above threshold—clamping at the horizontal axis for
i < Ith is not taken into account. As shown in figure 4.6, this simple model
provides a plausible explanation of the variance of the VCSEL output power.
For i > 6mA, the model and measurement lines for the CMOS samples start
to disagree, due to thermal roll-over in the VCSELs.
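Equation 4.5 can be evaluated directly once the five moments are known. The sketch below is a minimal Python illustration; the numerical moments in the usage part are assumed placeholders (only the typical threshold of about 0.9 mA comes from the text), and the function name is hypothetical.

```python
import math


def cv_output_power(i_ma, mu_th, sigma_th, mu_h, sigma_h, rho):
    """Equation 4.5: coefficient of variation of L(i), valid well above threshold."""
    headroom = i_ma - mu_th
    if headroom <= 0:
        raise ValueError("equation 4.5 only applies for currents above threshold")
    variance = ((sigma_h / mu_h) ** 2
                + (sigma_th / headroom) ** 2
                - 2.0 * rho * sigma_h * sigma_th / (mu_h * headroom))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


if __name__ == "__main__":
    # Placeholder moments, chosen for illustration only.
    moments = dict(mu_th=0.9, sigma_th=0.05, mu_h=0.65, sigma_h=0.01, rho=0.5)
    for i in (2.0, 4.0, 8.0):
        print(f"i = {i:.1f} mA -> CV_L = {100 * cv_output_power(i, **moments):.2f} %")
```

With a positive correlation ρ, the cross term partially cancels the threshold and slope contributions, which is consistent with the shape of the modeled curves having a minimum at moderate currents.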
4.2.2.2 Dynamic response
The general dynamic VCSEL response has already been discussed in section
3.3.3.
Direct measurement of the inter-channel skew contribution of driver-VCSEL
pairs is not possible in the IO setup, as the central permutation switch (see
figure 4.20 on page 107) introduces nonuniform delay contributions in the
signal paths. However, the response time of a driven VCSEL is several
times shorter than the delay of a photodiode receiver circuit, and variations
of the VCSEL response time (variations of the order of 10 ps) should be
correspondingly shorter than receiver delay variations (several 10s of ps for
0.35–µm CMOS technology). Therefore the skew contribution of driver-VCSEL
pairs has a minor impact on full link scale. This may no longer be the case
in more advanced CMOS technologies, as the switching speed of currently
available VCSELs is comparatively lower.
We should mention here that the static uniformity of VCSELs can affect the
link skew as well, through the dependence of the receiver delay on the optical
modulation amplitude of the incident beam (figure 4.21).
4.2.2.3 Spectral uniformity
Uniformity of VCSEL spectra is important: chromatic dispersion introduces
skew between fibers, and the wavelength-dependent photodiode responsivity
adds to the divergence of received signal amplitudes. We have measured
the optical spectra of all VCSELs on one DTA IC sample, using a constant
current setting of 6.8mA, with all lasers emitting simultaneously. The average
center wavelength is 967.84 nm with a standard deviation of 0.36 nm and
a total range of 1.59 nm. This is very homogeneous given the average rms
linewidth of 0.56 nm. The effect of wavelength variation on the time of flight
is negligible: the range of observed center wavelengths causes less than 100 fs
total skew inside 1m of glass fiber or plastic optical fiber. The change in
photodiode responsivity is present but not significant: at 25 °C, the maximal
relative difference is only 0.3%, increasing to 0.9% at 70 °C.
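The order of magnitude of the chromatic skew quoted above can be checked with the usual first-order relation skew = |D| · L · Δλ. The sketch below is only a rough illustration: the dispersion coefficient of 60 ps/(nm·km) is an assumed placeholder for fiber near 970 nm, not a value from the text, whereas the 1.59 nm wavelength range and the 1 m length are taken from the measurements described above.

```python
ASSUMED_DISPERSION_PS_PER_NM_KM = 60.0  # assumed magnitude, not from the text


def chromatic_skew_fs(length_m, wavelength_range_nm,
                      dispersion_ps_per_nm_km=ASSUMED_DISPERSION_PS_PER_NM_KM):
    """First-order chromatic skew (fs) between channels of different wavelength."""
    length_km = length_m / 1000.0
    skew_ps = dispersion_ps_per_nm_km * length_km * wavelength_range_nm
    return skew_ps * 1000.0


if __name__ == "__main__":
    print(f"skew over 1 m: {chromatic_skew_fs(1.0, 1.59):.0f} fs")
```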
4.3 Light in flight
This section treats the modeling and characterization of the optical path, the
essential and most spacious part of an optical link. We focus on an optical path
with common graded-index (GI) multimode fibers, MT-like connectors and
direct fiber butt coupling—the kind of packaging and optical path discussed
in section 2.5.
4.3.1 Sources of nonuniformity in the optical path
By the uniformity of the optical path we refer to differences across channels of
the integrity of optical pulses, the attenuation and the delay. An optical path
based on telecom-grade GI fiber scores high marks on multiple aspects, which
also eases the analysis and reduces it to mainly an evaluation of coupling
efficiencies at fiber ends.
The matched velocities of different same-frequency modes in GI multimode
fibers yield a bandwidth-distance product of about 0.5–5 GHz·km in glass
fiber [Corning, 2007] and up to the lower part of this range for plastic optical
fiber (POF) [Zubia and Arrue, 2001]. Assuming a simple Gaussian response
model, the step response of a 1-m fiber will exhibit a 10–90% rise time between
0.07 and 0.7 ps (on top of a 0.1-ps chromatic dispersion contribution), which is
presently insignificant for all intents and purposes (and is disregarded from
here on). For longer lengths, e.g., up to 100 m for fiber-based Ethernet, this is
no longer the case.

Figure 4.7: Overview of significant effects in a fiber-based optical path: VCSEL-fiber
coupling losses and crosstalk, macrobend losses, dispersion, connector coupling losses
and crosstalk, and fiber-photodiode coupling losses, crosstalk and absorption.
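The quoted rise times can be reproduced with a short calculation. The Python sketch below assumes that the bandwidth-distance product refers to the −3 dB point of the amplitude response of a Gaussian low-pass filter, for which the step response is a normal cumulative distribution; this interpretation and the function name are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def gaussian_rise_time_ps(bandwidth_distance_ghz_km, length_m):
    """10-90% rise time (ps) of a Gaussian response with the given bandwidth."""
    f_3db_hz = bandwidth_distance_ghz_km * 1e9 / (length_m / 1000.0)
    # |H(f)| = exp(-2 pi^2 sigma_t^2 f^2) drops to 1/sqrt(2) at f_3db.
    sigma_t = math.sqrt(math.log(2.0)) / (2.0 * math.pi * f_3db_hz)
    # The step response is the normal CDF: 10% and 90% lie at -/+ z90 * sigma_t.
    z90 = NormalDist().inv_cdf(0.9)
    return 2.0 * z90 * sigma_t * 1e12


if __name__ == "__main__":
    for bd in (5.0, 0.5):
        print(f"{bd} GHz.km over 1 m -> {gaussian_rise_time_ps(bd, 1.0):.3f} ps")
```

Under this assumption the rise time times the bandwidth is about 0.34, so a fiber length of 100 m at 0.5 GHz·km would already give a rise time of several tens of picoseconds.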
With respect to the uniformity of the time-of-flight inside adjoining fibers,
Kanjamala and Levi [1995] have tested 10-wide ribbons made of 62.5 µm (core
diameter)/125 µm (cladding diameter) graded-index glass fiber and found
less than 0.25 ps/m skew. Furthermore, if a parallel optical path is terminated
with a 1-mm accuracy on the length of adjoining fibers or fiber ribbons, a
time-of-flight skew of at most 2–3 ps is induced. We assume that, given due
care, the skew caused by optical time-of-flight differences can be ignored.
The attenuation in GI glass fiber is of the order of 0.001 dB/m [Corning, 2007].
In POF, a wide range of 0.01–10 dB/m is covered by different materials and at
different wavelengths. POF opacity can thus be sizeable; the possibility of a
larger optical path attenuation is taken into account in the analyses of the next
chapters. This material-related attenuation coefficient is generally uniform
and characterized by the manufacturer.
As a result of the above, the chief contribution to the average attenuation in
the optical path and its uniformity is the coupling at both ends of the optical
path and at any intermediate transitions. In a POF-based optical path, tight
fiber bends can affect bandwidth and attenuation figures as well. Macrobend
losses are not that important for glass fiber which mechanically disallows tight
bends; for a discussion on macrobend losses we refer to Winkler et al. [1979];
Ghatak et al. [1988]; Loke and McMullin [1990]; Arrue et al. [2001]; Makino
et al. [2005].
In this chapter, the coupling at the optical path extremities is modeled and
characterized. In the next section a measurement setup on the IO project
hardware is introduced to statistically quantify the coupling between a VCSEL
and a fiber or a fiber and a photodiode given their relative alignment and the
VCSEL drive current, taking irregular directional emission into account. Two
standard fiber sizes have been considered. In section 4.3.3, a simple ray tracing
model is fitted to the average measured VCSEL-fiber coupling efficiency given
the VCSEL-fiber alignment and the VCSEL current setting.
It turns out that the directional emission pattern of VCSELs and their thresh-
old/efficiency figures are statistically dependent, imposing a joint treatment
for other statistics than the average. Therefore the VCSEL process variations
and coupling efficiency uncertainty are treated together starting from sec-
tion 4.3.4, in which the alignment-conditional measurements are molded into
figures for fiber-coupled optical modulation amplitude and average optical
power. These will turn out to be essential quantities for the derivation of bit
error ratio, receiver delay and logic threshold bias in the next chapter. Finally,
section 4.3.5 discusses the modeling and characterization of the array-wide
coupled power at the VCSEL side when an 8×8-fiber connector is plugged in.
Here, we specifically target the statistical dependence of optical modulation
amplitude and average optical power, and any dependences across array
positions.
4.3.2 Experimental characterization of fiber coupling efficiency
4.3.2.1 Measurement setup
We have realized a measurement setup to accurately quantify VCSEL-to-fiber
and fiber-to-photodiode coupling efficiencies. This setup is shown in figure
4.8. A prototype board—an IO project converter board or high-speed (0.18 µm
CMOS) board as shown in section 1.6—is securely attached to an optical
table. Using a setup with three orthogonal Newport 850F linear actuators,
we can position one end of an optical fiber at any location and elevation
over the VCSEL or photodiode arrays with a 1–µm accurate repeatability.
The other fiber end can be connected to another prototype board with an
MPO-like connector, or terminated with an FC/PC connector for coupling
to a standalone laser or optical power meter as required.
4.3.2.2 Experimental results
Transmitting side Several papers report on the coupling efficiency of a mul-
timode VCSEL and a butt-coupled fiber, based either on theoretical considera-
tions [Heinrich et al., 1997; Toffano et al., 2003] or measurements [Mohammed
et al., 2004]. However, to the author’s knowledge, no figures are available
accounting for the substantial variations between emission profiles of different
VCSELs. We have performed a large number of measurements on an 8×8
VCSEL array on a fully mounted DTA IC in order to construct a stochastic
model for the laser output and coupling efficiency.

Figure 4.8: Measurement setup for fiber-device coupling efficiencies (X, Y and Z stages,
fiber holder, test board).

Figure 4.9: Close-up of the fiber holder above a lidless PCB-mounted 2.5 Gbps IO
project package.

Figure 4.10: Coupled power between a multimode 62.5 µm/125 µm graded-index glass
fiber and the VCSEL underneath while sweeping the horizontal plane at 50 µm over a
VCSEL array (i = 5.3 mA). The contour plot (contours at 0.0, 2.0, 3.0, 3.5, 3.9 and
4.2 dBm) is locally magnified 3 times around each VCSEL center. The VCSELs are
numbered 1 to 64 row by row, from 1–8 in the first row to 57–64 in the last.
Two standard graded-index multimode fibers have been tested: 62.5 µm/125 µm
with a numerical aperture (NA) of 0.275 and 50 µm/125 µm with a NA of 0.200.
For both fibers, the coupled power has been measured for 1m fiber length, 6
VCSEL currents between 2mA and 10mA, 7 elevations between 10 µm and
150 µm over the array surface, and a grid of 17×17 horizontal positions over
each VCSEL, with a height-dependent grid spacing. The result of this sweep
at one working distance is shown in figure 4.10. Provisions were also made for
tilting the fiber, but only an unrealistically large tilt could produce significant
changes. Given the long duration of six days of the automated measurement
for one array and one fiber, only one VCSEL array was considered.
The measured optical powers are expressed in mW and denoted by
$m_v(i, \mathbf{r})$, where i is the (actual) VCSEL current, $v \in \{1..n_v\}$
($n_v = 64$) the serial number of the measured VCSEL according to figure 4.10
and $\mathbf{r} \triangleq (r, \theta, z)$, where r and θ are the radius and
azimuth of the fiber misalignment, and z the elevation over the top array
surface. Although our raw measurement data were taken at a finite set of
currents and misalignments, we extend m to a continuous domain with respect
to $\mathbf{r}$ and i, with intermediate values determined by interpolation.
This approximation is accurate enough given the density of the sampling
points.
The total VCSEL output power was determined in a separate measurement
with a large area detector and is denoted by lv (i). The coupling efficiency
is written as ηv (i, r) = mv (i, r) /lv (i). The total output power l has been
additionally measured on two other fully mounted arrays, yielding three sets
of samples (one for each array). Sufficiently similar distributions within these
sets are observed to suggest that the m-values of only one array will allow
meaningful statistical processing.
At the nominal working distance z = 50 µm, and in the absence of radial
misalignment, the coupling losses using the 62.5 µm/125 µm fiber are in the
range 0.8–1.1 dB at 2mA VCSEL current, worsening to 1.0–1.4 dB at 5mA. The
50 µm/125 µm fiber exhibits much larger coupling losses from 1.2–2.3 dB at
2mA to 1.6–3.0 dB at 5mA and is not further considered due to this result.
There is no apparent spatial correlation between adjoining VCSELs, and no
preferential azimuthal angle of emission could be observed.
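For comparison with the coupling efficiencies ηv used elsewhere in this chapter, the coupling losses quoted above in dB can be converted with η = 10^(−loss/10). A minimal Python sketch, with hypothetical function names:

```python
import math


def loss_db_to_efficiency(loss_db):
    """Coupling efficiency (0..1) corresponding to a coupling loss in dB."""
    return 10.0 ** (-loss_db / 10.0)


def efficiency_to_loss_db(efficiency):
    """Coupling loss in dB corresponding to a coupling efficiency (0..1)."""
    return -10.0 * math.log10(efficiency)


if __name__ == "__main__":
    for loss in (0.8, 1.1, 1.4, 3.0):
        print(f"{loss:.1f} dB -> efficiency {loss_db_to_efficiency(loss):.3f}")
```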
Receiving side As shown in figure 4.11, the coupling efficiency at the
receiver side is very uniform. As long as r < 15 µm and z < 100 µm (at least),
for both kinds of fibers there is no noticeable dependence of the observed
photocurrent on the fiber end position. At nominal working distance, a
significant decrease of the photocurrent begins to occur beyond 20 µm radial
misalignment. This is not surprising, as the 80-µm detector size was
established to accommodate a wide choice of fibers, e.g., step-index fibers with a
broader emission. The fiber-photodiode coupling efficiency nonuniformity is
safely ignored from here on, as the contribution at the VCSEL side is much more
pronounced.

Figure 4.11: Coupling efficiency between a multimode 62.5 µm/125 µm graded-index
glass fiber and the photodiode underneath while sweeping the horizontal plane at
50 µm over a photodiode array (contours at 0.50, 0.80 and 0.95). The contour plot is
locally magnified 3 times around each photodiode’s center. Twelve receiver circuits
had provisions for photocurrent monitoring; one monitor was defective.
4.3.3 Coupling efficiency estimation using a ray tracing model
For given r, z and i, the average of ηv (i, r) over v and θ can be well approxi-
mated by the coupling efficiency resulting from a ray tracing-based simulation.
In this simulation, the VCSEL is represented as a simple Gaussian beam
parameterized by the location of its beam waist below the array surface zref
and minimal spot size w0, which depends on the VCSEL current to account
for beam widening with rising current, caused by increasingly excited
higher-order VCSEL modes. The beam parameters zref and w0 are treated
as fitting parameters. Rays are randomly generated according to the beam
profile and classified according to their point of impact and orientation in a
straight piece of parabolic-index fiber. This fiber is given a core radius and
numerical aperture corresponding to the measurement. The behavior of the
ray in the fiber can be determined analytically, yielding a quick classification
into bound, tunneling and refracting ray categories. The programming code
of this model is included in appendix C.
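The following Python sketch illustrates the principle of such a simulation; it is not the code of appendix C. It makes several simplifying assumptions that go beyond the text: rays are sampled with independent normal positions and slopes (a paraxial ray picture of a Gaussian beam), they travel in a straight line from the waist to the facet, Fresnel reflection is ignored, and only rays satisfying the local numerical aperture condition of a parabolic-index fiber are counted as bound (tunneling rays are counted as lost). All names are hypothetical.

```python
import math
import random

WAVELENGTH_UM = 0.968  # close to the measured average center wavelength
Z_REF_UM = 24.5        # fitted waist depth below the array surface (from the text)


def coupling_efficiency(r_um, z_um, w0_um, core_radius_um=31.25,
                        numerical_aperture=0.275, n_rays=20000, seed=1):
    """Fraction of sampled rays accepted as bound rays by a parabolic-index fiber."""
    rng = random.Random(seed)
    sigma_pos = w0_um / 2.0                               # from 1/e^2 radius w0
    sigma_ang = WAVELENGTH_UM / (math.pi * w0_um) / 2.0   # from 1/e^2 half-angle
    distance = z_um + Z_REF_UM                            # waist to fiber facet
    bound = 0
    for _ in range(n_rays):
        # Ray position and slope at the beam waist, independent per axis.
        x0, y0 = rng.gauss(0.0, sigma_pos), rng.gauss(0.0, sigma_pos)
        ax, ay = rng.gauss(0.0, sigma_ang), rng.gauss(0.0, sigma_ang)
        # Straight-line propagation; the fiber axis is offset by r_um along x.
        x = x0 + ax * distance - r_um
        y = y0 + ay * distance
        rho_sq = (x * x + y * y) / core_radius_um ** 2
        if rho_sq >= 1.0:
            continue  # the ray misses the core
        sin_sq = math.sin(math.hypot(ax, ay)) ** 2  # angle to the fiber axis, in air
        # Local numerical aperture of a parabolic index profile.
        if sin_sq <= numerical_aperture ** 2 * (1.0 - rho_sq):
            bound += 1
    return bound / n_rays  # all rays carry equal power


if __name__ == "__main__":
    w0 = WAVELENGTH_UM / (math.pi * math.radians(15.0))  # 15 degree divergence
    for r in (0.0, 15.0, 30.0):
        print(f"r = {r:4.1f} um, z = 50 um -> eta ~ {coupling_efficiency(r, 50.0, w0):.2f}")
```

Because every sampled ray carries the same power, the fraction of bound rays directly estimates the ratio of bound power to total power.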
The ratio of the power in the bound rays to the power in all considered rays
yields an estimate of the coupling efficiency. Figure 4.12 shows the measured
and estimated average coupling efficiencies for the 62.5 µm/125 µm fiber at
5.3mA VCSEL current. When the fitting parameters are redetermined for
different VCSEL currents, zref indeed retains a constant value of 24.5 µm,
roughly corresponding to the real location of the (bottom-emitting) VCSELs
below the array surface. The change of the minimal spot size w0 with the
4.3 Light in ight 97
20
40
60
80
100
120
140
0 5 10 15 20 25 30 35
e
le
va
tio
n 
z 
(µm
)
radial distance r (µm)
0.75
0.70
0.65
0.55
0.45
0.40
0.35
0.30
0.
25
0.
20
0.
15
0.
10
0.50
0.60
Figure 4.12: Contour plot of measured coupling efficiencies ηv (5.3mA, r) for different
(r, z)-misalignments, averaged over v and θ (62.5 µm/125 µm fiber). The dashed lines
represent calculated coupling efficiencies, each resulting from a ray tracing-based
simulation of a Gaussian beam and a parabolic-index fiber.
current translates into a steadily increasing divergence with increasing current.
The resulting Gaussian beam divergence λ/piw0 is 15° at 5mA, with a slope
of 0.56°/mA. At corresponding currents, the measurements for both fiber
sizes are well approximated using the same fitting parameters, suggesting that
this simulation approach can be extended to estimate the average coupling
efficiency for other choices of core radius and numerical aperture.
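The fitted divergence figures translate into a minimal spot size through w0 = λ/(π · divergence). The Python sketch below assumes that the divergence varies linearly with current around the 5 mA point (15°, slope 0.56°/mA) and uses the measured average center wavelength; the linear extrapolation to other currents and the function names are assumptions for illustration.

```python
import math

WAVELENGTH_UM = 0.96784  # measured average center wavelength (967.84 nm)


def divergence_deg(i_ma):
    """Gaussian beam divergence (degrees), linearized around 5 mA."""
    return 15.0 + 0.56 * (i_ma - 5.0)


def waist_um(i_ma):
    """Minimal spot size w0 (um) such that lambda / (pi * w0) equals the divergence."""
    return WAVELENGTH_UM / (math.pi * math.radians(divergence_deg(i_ma)))


if __name__ == "__main__":
    for i in (2.0, 5.0, 8.0):
        print(f"i = {i:.0f} mA: divergence {divergence_deg(i):.2f} deg, w0 {waist_um(i):.3f} um")
```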
4.3.4 Misalignment-conditional characterization of VCSEL
process variations
At the nominal fiber position, the coupling efficiency appears uncorrelated
with VCSEL process variations. However, as the radial misalignment increases,
the coupling efficiency decreases much faster for brighter than for darker
VCSELs. For calculations involving variations of the fiber-coupled power,
we therefore have to consider the VCSEL process variations and coupling
efficiency uncertainty together.
The fiber-coupled optical modulation amplitude (OMA) and average optical
power (AVG) are essential quantities for the forthcoming derivation of bit
error ratio, receiver delay and logic threshold bias. We will now examine
the influence of VCSEL process variations on these quantities. Assuming
that there is neither a spatial correlation nor a systematic variation of the
VCSEL properties over the array, all VCSELs are independent samples from a
common distribution. We characterize a VCSEL by its current- and
alignment-conditional fiber-coupled optical modulation amplitude and average optical
power (using the 62.5 µm/125 µm fiber). These powers are written as random
variables OMA (i0, i1, r) and AVG (i0, i1, r), where i0 and i1 represent the
respective low and high VCSEL currents, and r is defined in the same way as in
our m-measurements (fiber tilt is again ignored). We represent these quantities
logarithmically (in dBm) so as to transform multiplicative disturbances (e.g.,
slope efficiency and coupling efficiency variations) into additive ones.

Figure 4.13: The continuous lines show the average optical modulation amplitude
(dBm) for a given radial fiber butt misalignment r and elevation z at i0 = 1.6 mA and
i1 = 3 mA. At this current setting, the average total VCSEL output power is 0 dBm.
The dashed contours are isoprobability lines of fiber butt positions, where different
fiber-laser combinations in the array have been taken into account equally. The total
probability inside each contour is indicated.
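Joint sample data of these random variables can be obtained from the
measurements:

$$
\mathrm{oma}_v^{[\mathrm{dBm}]}(i_0, i_1, \mathbf{r}) = 10 \log_{10}
\left[ m_v^{[\mathrm{mW}]}(i_1, \mathbf{r}) - m_v^{[\mathrm{mW}]}(i_0, \mathbf{r}) \right]
\tag{4.6}
$$

$$
\mathrm{avg}_v^{[\mathrm{dBm}]}(i_0, i_1, \mathbf{r}) = 10 \log_{10}
\left[ \frac{m_v^{[\mathrm{mW}]}(i_0, \mathbf{r}) + m_v^{[\mathrm{mW}]}(i_1, \mathbf{r})}{2} \right]
\tag{4.7}
$$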
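Equations 4.6 and 4.7 amount to the following conversion from two coupled powers in mW to OMA and AVG in dBm. This is a minimal Python sketch with hypothetical function names; the powers in the usage part are made-up illustration values, not measurements.

```python
import math


def oma_dbm(m_i0_mw, m_i1_mw):
    """Equation 4.6: optical modulation amplitude in dBm from coupled powers in mW."""
    return 10.0 * math.log10(m_i1_mw - m_i0_mw)


def avg_dbm(m_i0_mw, m_i1_mw):
    """Equation 4.7: average optical power in dBm from coupled powers in mW."""
    return 10.0 * math.log10((m_i0_mw + m_i1_mw) / 2.0)


if __name__ == "__main__":
    low_mw, high_mw = 0.4, 1.2  # illustrative coupled powers at i0 and i1
    print(f"OMA = {oma_dbm(low_mw, high_mw):.2f} dBm, AVG = {avg_dbm(low_mw, high_mw):.2f} dBm")
```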
Assuming the absence of preferred azimuthal laser-fiber alignment angles as
observed above, the joint distribution of (OMA,AVG) (i0, i1, r) is independent
of θ. A full working set of sample data for given (i0, i1, r, z) can be constructed
from the nv VCSELs with known coupling data using equations 4.6–4.7 with
sufficiently dense resampling in the θ-direction (typically r · ∆θ ≤ 1 µm).
Distribution tests for different laser currents (above 1.5mA) in the relevant
(r, z)-range (the outermost dashed contour in figure 4.13) justify a joint normal
distribution
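$$
(\mathrm{OMA}, \mathrm{AVG})(i_0, i_1, \mathbf{r}) \sim N_2 \left(
\begin{bmatrix} \mu_{\mathrm{OMA}} & \mu_{\mathrm{AVG}} \end{bmatrix},
\begin{bmatrix}
\sigma^2_{\mathrm{OMA}} & \rho_{\mathrm{OMA,AVG}}\,\sigma_{\mathrm{OMA}}\sigma_{\mathrm{AVG}} \\
\rho_{\mathrm{OMA,AVG}}\,\sigma_{\mathrm{OMA}}\sigma_{\mathrm{AVG}} & \sigma^2_{\mathrm{AVG}}
\end{bmatrix}
\right),
\tag{4.8}
$$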
where all parameters of the normal distribution are functions of (i0, i1, r, z).
Estimates for these parameters can be constructed from the sample data in the
standard way. Figure 4.13 illustrates
$\hat{\mu}_{\mathrm{OMA}}(1.6\,\mathrm{mA}, 3\,\mathrm{mA}, r, z)$ for different r
and z.
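One way to construct such estimates is shown in the Python sketch below, which computes sample means, sample standard deviations and the sample correlation coefficient from paired OMA and AVG samples. It assumes the unbiased (n − 1) estimators; the function name and the sample values in the usage part are hypothetical.

```python
from statistics import mean, stdev


def fit_joint_normal(oma_samples_dbm, avg_samples_dbm):
    """Estimate (mu_OMA, mu_AVG, sigma_OMA, sigma_AVG, rho) from paired samples."""
    if len(oma_samples_dbm) != len(avg_samples_dbm):
        raise ValueError("OMA and AVG samples must be paired")
    n = len(oma_samples_dbm)
    mu_oma, mu_avg = mean(oma_samples_dbm), mean(avg_samples_dbm)
    sigma_oma, sigma_avg = stdev(oma_samples_dbm), stdev(avg_samples_dbm)
    covariance = sum((o - mu_oma) * (a - mu_avg)
                     for o, a in zip(oma_samples_dbm, avg_samples_dbm)) / (n - 1)
    rho = covariance / (sigma_oma * sigma_avg)
    return mu_oma, mu_avg, sigma_oma, sigma_avg, rho


if __name__ == "__main__":
    oma = [-1.0, -1.3, -1.1, -1.6, -1.2]  # made-up sample values in dBm
    avg = [1.1, 0.9, 1.0, 0.6, 0.8]
    print(fit_joint_normal(oma, avg))
```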
The results per se are not particularly useful towards system-level interpre-
tation unless they are combined with the stochastic behavior of the fiber
alignment, which is treated in the next section.
4.3.5 Stochastic evaluation of VCSEL-fiber power coupling
in a connectorized package
When a fiber bundle is connectorized, the positions of the fiber facets are
mutually fixed relative to each other as well as relative to the alignment
features of the connector. When the connector is plugged into a package,
the misalignment of laser-fiber pairs at different positions in the array is
therefore strongly correlated. Translational mismatches cause only uniform
misalignments of laser-fiber pairs as long as the orientation of the fiber facet
array w.r.t. the optical array is constant. However, a rotation of the connector
in the plane of the optical array can introduce severely nonuniform radial
misalignment, and connector tilt causes the fiber facets to be positioned at
different working distances.
It turns out that both the inter-VCSEL variability and the fiber bundle align-
ment uncertainty are significant. The variation of fiber-coupled optical mod-
ulation amplitude and average optical power over different fiber-laser pairs
in the array characterizes the array-wide transmitter-side uniformity. We
will now derive a stochastic model for array-wide laser-fiber coupling in a
connectorized package, and determine array-wide power distributions.
We consider packages prepared using the index alignment method (section
2.5.3.3 on page 46), applied to the DTA modules of the IO project (one 8×8
laser and one 8×8 photodiode array). The connectorized fiber bundles use
62.5 µm/125 µm fibers, terminated with the connector design of figure 2.12(a)
(page 44).
We designate the physical position of laser-fiber pairs in the array by a serial
number d ∈ {1..nd} (nd = 64) according to figure 4.10. The letter d is used
instead of v to distinguish between array positions in general (d) and any
particular VCSEL from the measurements (v). The reason for this distinction
will become clear further on, as we will apply statistical resampling
(bootstrapping) of the examined VCSELs in order to obtain more robust estimates.
Consider an experiment where one mates a random package with a random
connectorized fiber bundle. The fiber-coupled optical powers of interest are
denoted by nd-dimensional vectors of random variables OMA (i0, i1) and
AVG (i0, i1) (in dBm), with one component for each laser-fiber pair (written
as, e.g., OMAd (i0, i1) for the component corresponding to array position
d). Given that the drive current nonuniformity will turn out to be almost
always negligible compared with VCSEL process variations and connector
alignment variations, actual low and high VCSEL currents i0 and i1 are used
to parameterize OMA and AVG. Only when $i_{\mathrm{mod,aim}} \ll 1\,\mathrm{mA}$
does drive current nonuniformity need to be taken into account (this can be
catered for by adding the deviation of the targeted modulation current, in dB,
to OMA (i0, i1)). In
numerical evaluations, i0 is chosen to be twice the laser threshold (around
1.6mA)—this assures i0 > 1.5 Ith array-wide, required for a good BER (see
subsection 4.4.3)—and i1 is swept from i0 to i0 + 7mA in steps of 0.1mA.
It is our goal to derive the (joint) statistical properties of OMA (i0, i1) and
AVG (i0, i1). To that end, a model for the uncertainties in the VCSEL-fiber
alignment, denoted by A, is combined with the model for laser-fiber coupling
subjected to VCSEL process variations derived above. To derive the probabil-
ity distribution of the joint mechanical alignment, fellow researcher Olivier
Rits has performed accuracy measurements on the prototypes, and different
tolerances that independently contribute to the joint alignment have been
characterized. These tolerances are listed in figure/table 4.14. The combined
effect of these tolerance data leads to an adequate stochastic model of the
alignment.
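To give an idea of how independent tolerances can be combined by Monte Carlo sampling, the Python sketch below draws the in-plane translational laser-fiber offset from items 1, 4, 6 and 12 of figure/table 4.14. It is a deliberately reduced illustration, not the model of appendix D.1: pin clearances, rotation, tilt and axial errors are left out, the 4.5σ maximal deviation is assumed to apply per axis for the 2D normal entries, and the 2D uniform entry is assumed to be independent per axis. All names are hypothetical.

```python
import math
import random

SIGMA_FACTOR = 4.5  # the table quotes maximal deviations at 4.5 sigma


def normal_2d(max_dev_um, rng):
    """2D normal offset; per-axis sigma = max_dev / 4.5 is an assumption."""
    sigma = max_dev_um / SIGMA_FACTOR
    return rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)


def uniform_2d(half_range_um, rng):
    """2D uniform offset, assumed independent per axis within +/- half_range."""
    return (rng.uniform(-half_range_um, half_range_um),
            rng.uniform(-half_range_um, half_range_um))


def sample_radial_misalignment(rng):
    """One draw of the in-plane laser-fiber offset magnitude (um)."""
    offsets = [
        normal_2d(2.0, rng),   # item 1: fiber facet relative to connector
        normal_2d(1.5, rng),   # item 4: MT pin guide hole relative to connector
        normal_2d(2.0, rng),   # item 6: MT pin fixing hole relative to module lid
        uniform_2d(2.0, rng),  # item 12: module lid center relative to CMOS surface
    ]
    dx = sum(offset[0] for offset in offsets)
    dy = sum(offset[1] for offset in offsets)
    return math.hypot(dx, dy)


def mean_radial_misalignment(n_samples=5000, seed=0):
    """Monte Carlo estimate of the mean radial offset (um)."""
    rng = random.Random(seed)
    return sum(sample_radial_misalignment(rng) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    print(f"mean radial offset ~ {mean_radial_misalignment():.2f} um")
```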
In appendix D.1, we derive key properties of the joint distribution of the
vectors OMA (i0, i1) and AVG (i0, i1). Only conclusions are mentioned here.
It is observed that the uncertainty of power figures at any array position is
primarily caused by the variation of VCSEL characteristics, rather than by
misalignment. Effects with array-wide correlation are produced by significant
global translational misalignment, and the smaller impact of global rotational
misalignment is dwarfed by the VCSEL process variations. For that reason, we
choose a natural way of decomposing OMA (i0, i1) and AVG (i0, i1) as a sum
of a global alignment-related contribution and component-specific deviations
(caused by VCSEL variability and thus considered i.i.d. across different array
positions):
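\[ \mathrm{OMA}_d(i_0, i_1) = E\bigl[ \langle \mathrm{OMA}(i_0, i_1) \rangle \mid A \bigr] + \Delta^{\mathrm{OMA}}_d(i_0, i_1) \tag{4.9} \]
\[ \mathrm{AVG}_d(i_0, i_1) = E\bigl[ \langle \mathrm{AVG}(i_0, i_1) \rangle \mid A \bigr] + \Delta^{\mathrm{AVG}}_d(i_0, i_1) \tag{4.10} \]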
The angular brackets denote an averaging over all vector components (all
array positions). Sample data for the global alignment-related contributions
can be generated using the Monte Carlo method mentioned in appendix
D.1. The correlation between E [ 〈OMA (i0, i1)〉| A] and E [ 〈AVG (i0, i1)〉| A] is
always > 0.99; both are considered identical except for a difference in location
and scale. Their distribution is significantly left-skewed (see figure 4.15).
A statistical dependence between the global alignment-related contribution
and component-specific deviations is present but turns out to be sufficiently
small to be ignored. The component-specific deviations ∆OMAd (i0, i1) and
∆AVGd (i0, i1) for a given position d exhibit no significant skew and
approximately have a position-independent joint normal distribution. This
distribution has zero expectations, standard deviations denoted by
σ∆OMA (i0, i1) and σ∆AVG (i0, i1), and a correlation coefficient
ρ∆OMA,∆AVG (i0, i1).

 #   feature                 relative to     error type          maximal deviation (4.5σ)   distribution
 1   each fiber facet        connector       radial position     2 µm                       2D normal
 2   each fiber facet        connector       axial position      10 µm                      normal
 3   connector protrusion    -               thickness           20 µm                      normal
 4   MT pin guide hole       connector       radial position     1.5 µm                     2D normal
 5   MT pin guide hole       -               diameter            0.5 µm                     normal
 6   MT pin fixing hole      module lid      radial position     2 µm                       2D normal
 7   MT pin fixing hole      -               diameter            1 µm                       normal
 8   MT pin                  -               diameter            1 µm                       normal
 9   optical array surface   CMOS surface    distance            10 µm                      normal
10   optical array surface   CMOS surface    radial position     1 µm                       not modeled
11   module lid center       CMOS surface    distance            ±10 µm                     uniform
12   module lid center       CMOS surface    in-plane position   ±2 µm                      2D uniform
13   module lid              CMOS surface    rotation            ±0.05°                     uniform
14   module lid              CMOS surface    tilt                ±0.30°                     uniform

Figure 4.14: Package and connector tolerances (courtesy of Olivier Rits)

[Figure 4.15: histogram of sample count (0 to 4500) versus fiber-coupled
optical modulation amplitude (dBm), −1.8 to −1.0 dBm]
Figure 4.15: Histogram of the array-averaged optical modulation amplitude
E [ 〈OMA (1.6mA, 3mA)〉| A] (100000 samples, bin size 0.01dB). The distribution
is clearly left-skewed.
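The decomposition of equations 4.9 and 4.10 can be illustrated with a small sampling sketch in Python. It is not the procedure of appendix D.1: the global alignment-related contribution is simply passed in as a number, and the values used below for σ∆AVG, ρ and the global contributions are placeholders chosen for illustration. Only σ∆OMA = 0.30 dB is taken from this section.

```python
import math
import random

def sample_channel_deviations(n_channels, sigma_oma, sigma_avg, rho, rng):
    """Draw i.i.d. (delta_oma, delta_avg) pairs in dB from a zero-mean joint
    normal distribution with the given standard deviations and correlation."""
    pairs = []
    for _ in range(n_channels):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        d_oma = sigma_oma * z1
        # Mixing z1 and z2 this way gives corr(d_oma, d_avg) == rho.
        d_avg = sigma_avg * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        pairs.append((d_oma, d_avg))
    return pairs

def sample_array(global_oma, global_avg, deviations):
    """Equations 4.9 and 4.10: per-position value = global alignment-related
    contribution + component-specific deviation (all in dB units)."""
    return [(global_oma + d_o, global_avg + d_a) for d_o, d_a in deviations]

def correlation(pairs):
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(p[0] for p in pairs) / n
    mean_y = sum(p[1] for p in pairs) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in pairs)
    syy = sum((p[1] - mean_y) ** 2 for p in pairs)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pairs)
    return sxy / math.sqrt(sxx * syy)

# Placeholder inputs (assumptions): sigma_avg = 0.20 dB, rho = 0.5,
# global contributions of -1.3 dBm (OMA) and 0.5 dBm (AVG).
rng = random.Random(1)
deviations = sample_channel_deviations(64, 0.30, 0.20, 0.5, rng)
array_sample = sample_array(-1.3, 0.5, deviations)
```

Each call to `sample_array` yields one hypothetical 64-channel array; repeating it with fresh global contributions gives samples of the joint distribution.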
Figure 4.16 shows estimates of the mean and standard deviations of
E [ 〈OMA (i0, i1)〉| A] and E [ 〈AVG (i0, i1)〉| A], and all ∆-related distribution
parameters, as a function of the modulation current, using fixed i0 = 1.6mA.
The expected value of the global contribution to OMA, when represented
in mW, corresponds to the modulation power extracted from the average LI
curve, subjected to 1.15 dB coupling loss and an additional loss of 20 µW/mA
with rising modulation current. The standard deviation of the alignment-related
contributions is around 0.05–0.15 dB. The VCSEL process variations
appear in ∆OMA as a constant standard deviation of 0.30 dB. The behavior
of σ̂∆AVG and ρ̂∆OMA,∆AVG is a natural consequence of the positive correlation
between threshold current and slope efficiency across VCSELs, and becomes
apparent when locating i0 and the different i1 values on figure 4.4.
4.4 From light to chip
4.4.1 Photodiode array uniformity
4.4.1.1 Static uniformity
The general photodiode behavior has already been touched upon in section
3.3.4; here, we briefly recapitulate in order to introduce uniformity aspects
more easily.
[Figure 4.16, panel (a) mean global contributions: optical power (mW) versus
modulation current (mA); panel (b) deviations and correlation: standard
deviation (dB) and correlation coefficient versus modulation current (mA)]
Figure 4.16: Estimated mean and standard deviation of the misalignment-caused
array-averaged fiber-coupled modulation and average power; and parameters of the
joint normal distribution of (∆OMAd, ∆AVGd), the per-VCSEL power uncertainty
caused by VCSEL process variations. The off-current i0 is fixed at 1.6mA; the
modulation current i1 − i0 is indicated on the horizontal axis.

[Figure 4.17: responsivity E[R] (A/W) versus wavelength λ (nm), 965 to 975 nm,
one curve per temperature from 25 °C to 85 °C in steps of 10 °C]
Figure 4.17: Calculated mean photodiode responsivity versus wavelength at different
temperatures.

An optical beam with power l incident on a photodiode induces a current I
according to I (l) = Id + R · l, where Id is the dark current and R the responsivity.
Both parameters are interpreted as random variables across photodiodes. The
dark current is a temperature dependent diode saturation current:
\[ I_d(T) = I_{d,c} \, \exp\left( -\frac{e}{kT} \right), \tag{4.11} \]
where Id,c depends on the geometry, e ≈ 0.48 eV is the InGaAs band gap
energy, k is the Boltzmann constant, and T represents the immediate ambient
temperature (not treated as a random variable although written in upper
case). At 300K, the measured dark current over one IO project array is 7.3 nA
on average, with a total range of 4.7 nA (measurements courtesy of Albis
Optoelectronics). Even with an extrapolated value of 150 nA at 85 °C, the
contribution of this current to the photocurrent is negligible.
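The extrapolation to 85 °C follows directly from equation 4.11, since the geometry factor Id,c cancels when two temperatures are compared. A minimal Python sketch, using the 7.3 nA average at 300 K and e ≈ 0.48 eV from the text (the function and constant names are illustrative only):

```python
import math

K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant in eV/K

def dark_current(temp_k, ref_current, ref_temp_k=300.0, gap_ev=0.48):
    """Scale a dark current measured at ref_temp_k to temp_k using
    Id(T) = Id_c * exp(-e / kT); the factor Id_c cancels in the ratio."""
    exponent = -(gap_ev / K_BOLTZMANN_EV) * (1.0 / temp_k - 1.0 / ref_temp_k)
    return ref_current * math.exp(exponent)

# 7.3 nA measured at 300 K, extrapolated to 85 degrees Celsius.
i_85c = dark_current(273.15 + 85.0, 7.3e-9)
print(f"{i_85c * 1e9:.0f} nA")
```

The result lands close to the 150 nA quoted above, which is still orders of magnitude below the photocurrents of interest.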
The responsivity R depends on wavelength and temperature because of an
undesirable optical absorption in the InP substrate: the operating wavelength
of 970 nm lies close to the optical absorption edge around 950 nm.
The intended absorption in the InGaAs layer can be treated as a constant in
comparison. The responsivity can be modeled by
\[ R(\lambda, T) = R_0 \, \exp\bigl( -\alpha(\lambda, T) \, d \bigr), \tag{4.12} \]
where R0 is the maximal responsivity, d = 30 µm represents the substrate
thickness and α (λ, T) is the absorption coefficient in the Urbach region, as
described in [Beaudoin et al., 1997]. As shown in figure 4.17, the VCSEL center
wavelength uniformity becomes increasingly important for the uniformity of
the responsivity at elevated temperatures. In the IO setup, the excellent VCSEL
wavelength uniformity mentioned above yields an insignificant multiplicative
contribution to the responsivity which can be safely ignored (σ/µ < 0.2%).
[Figure 4.18: block diagram showing the supply, transimpedance amplifier,
limiting amplifier, postamplifier, decision level control, output, and the
AC-JTAG cell for boundary scan and on-wafer circuit testing]
Figure 4.18: Simplified view of the receiver circuit
At 300K and λ = 970nm, an average responsivity of 0.67A/W was measured
across devices before thinning (d = 150 µm). The coefficient of variation of
R0 was measured—with the same, physically moved, incident beam—over 11
devices after full processing, using photocurrent monitoring CMOS circuits,
and found to be 2.4%. This result is only an upper bound, as the photocurrent
monitoring circuits serve more as a diagnostic tool than as a precise instrument
and hence contribute to the measured variation as well.
4.4.1.2 Dynamic uniformity
The speed of the receiver circuit is dominated by an RC time constant, where
C is the sum of the photodiode capacitance and the receiver input capacitance
and R is the input resistance of the transimpedance amplifier. Variation of C
across devices therefore adds to the signal skew. The sensitivity of the receiver
circuit delay to input capacitance changes was extracted from a simulation and
found to be around 0.5 ps/fF. Photodiode capacitances have been measured
(courtesy of Albis Optoelectronics) on an 8×8 array before hybridization at a
nominal bias of −2 V, resulting in an average capacitance of 505 fF with a total
range of 7.6 fF (standard deviation 1.8 fF). The impact of a variation this small
on the receiver delay is negligible.
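The size of this effect follows from a one-line calculation, sketched here in Python with the figures quoted above (the names are illustrative only):

```python
DELAY_SENSITIVITY_PS_PER_FF = 0.5  # simulated delay sensitivity from the text
CAP_STD_FF = 1.8                   # measured standard deviation
CAP_RANGE_FF = 7.6                 # measured total range

def delay_spread_ps(cap_spread_ff, sensitivity=DELAY_SENSITIVITY_PS_PER_FF):
    """Receiver delay spread caused by a given spread in input capacitance."""
    return sensitivity * cap_spread_ff

skew_std_ps = delay_spread_ps(CAP_STD_FF)      # standard deviation of the delay
skew_range_ps = delay_spread_ps(CAP_RANGE_FF)  # total range of the delay
print(skew_std_ps, skew_range_ps)
```

Both values are a few picoseconds at most, which is small against the 800 ps bit period at 1.25 Gbps.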
4.4.2 Receiver circuit delay and skew
4.4.2.1 Introduction
Figure 4.18 shows again the scheme of the receiver circuit; it was discussed in
section 2.6.
[Figure 4.19: block diagram of the setup. An Agilent 81134A pulse generator
(1.25 GHz clock, PRBS 2^31 − 1 data) drives the transmitting board (driver,
VCSEL); the light passes through the connector, fiber, adjustable attenuation
and alignment stage to the receiving board (detector and receiver pairs,
permutation switch, differential outputs), which feeds an oscilloscope and an
Advantest D3286 BER tester. Path delays are labeled ∆t1 to ∆t6, with estimated
delays A and B inside the receiving board.]
Figure 4.19: Schematic overview of the receiver skew measurement setup.

For characterization purposes, we always consider photodiode-receiver pairs
as a whole as we cannot observe internal signals (e.g., the photocurrent). As
the decision threshold adapts itself to the average signal value, the incident
optical modulation amplitude oma is used as a base quantity instead of
the absolute photocurrent. We have set up a measurement to quantify the
effect of oma on the latency and the bit error ratio (BER) of a receiver circuit,
and across receiver circuits. The output skew between different photodiode-
receiver pairs is quantified as well. Figure 4.19 gives a schematic overview
of the measurement setup. An Agilent 81134A pulse generator is used as
the time base for the system. It generates a reference clock of 1.25GHz
and a derived pseudorandom binary sequence (PRBS) signal with period
2^31 − 1. This PRBS signal is used to modulate one laser of an 8×8 array
between controllable current levels. A multi-fiber bundle is terminated with an
MPO-like connector at one side only and connected to the array. The single
illuminated fiber is isolated from the bundle at the unterminated side, cleaved,
and mounted into our motorized single fiber alignment setup (figure 4.8). This
alignment setup can be controlled to illuminate any of the 8×8 photodiodes
of a prototype board using the same fiber. The on-chip permutation network
can be configured to route the electrical signal output of any chosen receiver
circuit to the same fast differential output pads. The differential output is split
into two ground-referenced signals, one attached to an oscilloscope and the
other to an Advantest D3286 BER tester. Both instruments are triggered by
the 1.25GHz reference clock.
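For readers unfamiliar with PRBS signals, the following Python sketch shows how such a sequence can be produced with a linear feedback shift register. The feedback polynomial x^31 + x^28 + 1 is the one commonly used for PRBS31 test patterns; that the pulse generator uses exactly this polynomial is an assumption, as the text only states the period. The function names are illustrative.

```python
def prbs_bits(width, tap, count, seed=1):
    """Generate `count` bits from a Fibonacci LFSR with feedback polynomial
    x^width + x^tap + 1. The seed must be nonzero within `width` bits."""
    mask = (1 << width) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be nonzero")
    out = []
    for _ in range(count):
        new_bit = ((state >> (width - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        out.append(new_bit)
    return out

def prbs_period(width, tap, seed=1):
    """Count steps until the register state repeats (small widths only)."""
    mask = (1 << width) - 1
    start = seed & mask
    state = start
    steps = 0
    while True:
        new_bit = ((state >> (width - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        steps += 1
        if state == start:
            return steps

# First bits of a PRBS31-style sequence (period 2^31 - 1, not checked here).
first_bits = prbs_bits(31, 28, 16)
# The short PRBS7 variant (x^7 + x^6 + 1) makes the period easy to verify.
short_period = prbs_period(7, 6)
```

The period check is only run on the 7-bit variant, since stepping through all 2^31 − 1 states in pure Python would take far too long.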
4.4.2.2 Receiver skew
Characterization of absolute latencies is not targeted. We want to measure
only differences between the latencies of several receiver circuits (= skew) or
the variation of the latency of one receiver circuit when oma is varied.
Figure 4.20: Permutation switch of the DTA IC (see also figure A.2 on page 172).
Not all paths through the switch have the same length, which would render skew
measurements at the transmitting side useless.
To compensate for the unknown and route-dependent delays of the permuta-
tion switch (figure 4.20), a low-skew clock signal can be injected right behind
the receiver output. The observed zero crossings of this clock signal establish
a good timing reference for data measurements. The skew caused
by remaining path length differences from the receiver circuit to the clock
insertion multiplexer was accounted for using an Elmore estimate of the path
delay [Rubinstein et al., 1983].
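An Elmore estimate for a simple RC ladder can be sketched as follows in Python. The segment resistance and node capacitance are hypothetical values, not those of the DTA IC, and a real on-chip route would generally be an RC tree rather than a plain ladder.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay from the driver to the last node of an RC ladder.
    resistances[k] is the series resistance of segment k and capacitances[k]
    the capacitance to ground at the node following that segment."""
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r            # total resistance between driver and node
        delay += upstream_r * c    # each capacitor sees all upstream resistance
    return delay

# Hypothetical wire model: 50 ohm and 20 fF per segment.
R_SEG, C_SEG = 50.0, 20e-15
short_path = elmore_delay([R_SEG] * 3, [C_SEG] * 3)
long_path = elmore_delay([R_SEG] * 5, [C_SEG] * 5)
skew_correction = long_path - short_path  # seconds
```

The difference between two such estimates is the kind of correction term that is subtracted from a measured skew to remove a known path length difference.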
BER and eye diagram measurements have been performed on 32 receivers with
an 11-step controllable attenuation in the optical path of 12–22dB, yielding
very low power levels, suitable for receiver testing purposes. Measurements
have been performed for 5 laser current configurations, with dc bias currents
between one and two times the VCSEL threshold. The values for oma at
the fiber exit are in the range −10 to −22 dBm. In total, 32 × 11 × 5 = 1760
configurations were explored. For each configuration, a necessarily short
BER measurement of one minute was performed, allowing accurate BER
quantification above 4 · 10^−11. Simultaneously, an eye diagram containing 2^21
samples of the PRBS signal was captured and analyzed (figure 4.23 shows an
example).
Figure 4.21 shows the extracted receiver delays—up to a constant common to
all channels—versus oma for one laser configuration and a controllable optical
attenuation. The increase of receiver delays with decreasing oma amounts to
17.5 ps/dB on average and is very uniform across channels (standard deviation
of 1.8 ps/dB, excluding one outlier). At given oma, the skew exhibits 42 ps
standard deviation and a total range of 175 ps. This figure includes effects
caused by photocurrent variations due to responsivity variations across the
photodiodes, process variations across receiver circuits, clock distribution skew
and residual errors in the compensation of electrical path length differences.

[Figure 4.21: delay (ps), 0 to 300, versus incident optical modulation
amplitude (dBm), −19 to −11 dBm]
Figure 4.21: Measurement of receiver delay versus incident optical modulation
amplitude on 32 photodiode-receiver combinations (laser i0 = 1.2mA and i1 = 2.9mA;
controllable optical attenuation of 12–19 dB). The time origin is arbitrarily chosen.
4.4.3 Bit error ratio characterization across photodiode-re-
ceiver pairs
The signal at a TIA output is corrupted by various sources of noise [Pepelju-
goski and Kuchta, 2003]. Part of this noise originates as additive noise at
the laser side, and is attenuated together with the actual signal, before it
arrives at the photodiode input. Hence it can be considered proportional to
oma[mW]. Different noise contributions between logic high and low signals are
not discernible in our measurements.
Another part of the noise originates in the photodiode and in the TIA itself.
This part is considered independent of the signal amplitude. Hence, we model
the additive noise signal N at the output of the k-th TIA as follows:
\[ N_k = \mathit{oma}^{[\mathrm{mW}]} \cdot N_{TX} + N_{RX,k}, \tag{4.13} \]
where NTX and NRX,k (for all k) are independent Gaussian noise processes.
The TIA output signal is fed into a limiting amplifier where it is compared
to the decision threshold. This threshold is based on a long-time signal
average, with a slight constant positive bias to avoid output oscillations in
dark photodiode receivers. If oma decreases below about -22 dBm, the eye
reduces to a baseline signal.
[Figure 4.22: bit error ratio (logarithmic scale, 10^−11 to 10^−2) versus
incident optical modulation amplitude (dBm), −18 to −12 dBm]
Figure 4.22: Measurement of bit error ratio versus incident optical modulation
amplitude on 32 photodiode-receiver pairs (laser i0 = 1.2mA, i1 = 2.9mA; controllable
optical attenuation of 13–18dB). The apparent occurrence of two curve families is an
illusion caused by unavailable BER data for the worst channels at -17 dBm (due to loss
of sync), and for the best channels at -12 dBm (not enough errors observed to extract a
reliable average BER).
The resulting digital signal is fed into the BER tester. As we assume no further
signal degradation in the limiting amplifier and the subsequent digital path,
the observed BER corresponds to the equivalent BER figure at the TIA output,
computed at the sampling instant corresponding to the optimal sampling
point determined by the BER tester.
The BER can be estimated as follows. The variance of each TIA’s output noise
signal is given by
\[ \sigma_N^2 = \left( \mathit{oma}^{[\mathrm{mW}]} \, \sigma_{N_{TX}} \right)^2 + \sigma_{N_{RX}}^2 . \tag{4.14} \]
Letting β denote the decision threshold bias, we find [Maxim AN576, 2001]
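\[
\mathrm{BER} = \frac{1}{2} \left[
  \Phi\left( \frac{\beta - \frac{\mathit{oma}}{2}}{\sigma_N} \right)
  + \Phi\left( \frac{-\beta - \frac{\mathit{oma}}{2}}{\sigma_N} \right)
\right], \tag{4.15}
\]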
where β, oma and σN are in milliwatt, a dc-balanced signal is assumed and
Φ denotes the cumulative distribution function (cdf) of the standard normal
distribution. A good fit of the measurements (of figure 4.22) can be obtained
with σNRX ≈ −27.5dBm, σNTX = 0.059 and β ≈ −23.8dBm (towards logical
1).
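Equations 4.14 and 4.15 are easy to evaluate numerically. The following Python sketch uses the fitted parameters quoted above; the function names are illustrative, and the standard normal cdf is computed from the complementary error function to retain precision for very small tail probabilities.

```python
import math

def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def dbm_to_mw(p_dbm):
    return 10.0 ** (p_dbm / 10.0)

# Fitted parameters from the text.
SIGMA_N_RX_MW = dbm_to_mw(-27.5)  # receiver-side noise, in mW
SIGMA_N_TX = 0.059                # transmitter-side noise, relative to oma
BETA_MW = dbm_to_mw(-23.8)        # decision threshold bias, in mW

def noise_sigma_mw(oma_mw):
    """Equation 4.14: total noise standard deviation at the TIA output."""
    return math.hypot(oma_mw * SIGMA_N_TX, SIGMA_N_RX_MW)

def ber(oma_dbm):
    """Equation 4.15 for a dc-balanced signal sampled at the optimal instant."""
    oma = dbm_to_mw(oma_dbm)
    sigma = noise_sigma_mw(oma)
    return 0.5 * (phi((BETA_MW - oma / 2.0) / sigma)
                  + phi((-BETA_MW - oma / 2.0) / sigma))

for level in (-18.0, -15.0, -12.0, -9.0):
    print(level, ber(level))
```

Because the transmitter-side noise scales with the signal, the curve flattens into a noise floor at high oma instead of falling without bound.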
The residual variation among channels can be attributed to photodiode respon-
sivity variations and variations in the TIA preamplifier gain, among others.
This is modeled as an additional common contribution to OMA[dBm] and
AVG[dBm] with a standard deviation of approximately 0.29 dB. When taking
this value, the standard deviation of E [ 〈OMA (i0, i1)〉| A] and σˆ∆OMA into
account (see figure 4.16 on page 103 for values), the modulation amplitude of
the signal before the limiting amplifier of any interconnect channel will suffer
a combined process variation σall with a standard deviation of 0.44dB.
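The combined figure is consistent with a root-sum-square of the individual contributions, as the following Python sketch shows. It assumes that the three contributions are independent and takes 0.14 dB for the alignment-related standard deviation, a value inside the 0.05–0.15 dB range quoted earlier but not stated explicitly in the text.

```python
import math

def combined_sigma_db(*sigmas_db):
    """Root-sum-square of independent contributions expressed in dB."""
    return math.sqrt(sum(s * s for s in sigmas_db))

# 0.29 dB: responsivity and TIA gain; 0.30 dB: VCSEL process variations;
# 0.14 dB: assumed alignment-related contribution (see lead-in).
sigma_all = combined_sigma_db(0.29, 0.30, 0.14)
print(round(sigma_all, 2))
```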
The BER curves (versus oma, obtained by a controllable optical attenuation)
essentially coincide across laser current settings as long as the current for
a logic zero is above approximately 1.5 times the VCSEL threshold Ith. At
lower current levels above threshold, the BER can get up to three orders of
magnitude worse than projected. The slower modulation response at low bias
is the probable cause of this effect.
4.4.3.1 Impact of timing jitter and offset on the bit error ratio
The BER figures analyzed so far assume sampling at the optimal sampling
point of the eye diagram. In order to evaluate what happens when the
sampling instant would be derived from a common clock signal, we have to
quantify the degradation of the BER when sampling at other points than the
optimal one.
We approximate the eye diagram at the TIA output by means of eight signals,
corresponding to the bit patterns 000, 001, . . . , 111, where we focus on the
central bit (centered around the time origin). Four of these—p1 (t), . . . , p4 (t)—
correspond to a central bit of 1; the other four are denoted q1 (t), . . . , q4 (t)
and correspond to a low central bit.
The eye diagram is then composed of many superpositions of these curves,
randomly shifted vertically by the additive signal noise N, and randomly
shifted horizontally by the timing jitter J. This jitter is assumed to have
a normal distribution and is caused by a variety of effects, including
intersymbol interference and clock jitter. The time difference between the
sampling epoch and the center of the eye is denoted Ts. The expected BER
when sampling at the relative sampling time Ts (with expectation ts,µJ and
standard deviation σJ) is then given by
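\[
\mathrm{BER} = \frac{1}{8} \left[
  \sum_{i=1}^{4} \int_{t=-\infty}^{\infty}
    \Phi\left( \frac{\beta - p_i(t)}{\sigma_N} \right)
    \mathrm{d}\Phi\left( \frac{t - t_{s,\mu_J}}{\sigma_J} \right)
  + \sum_{i=1}^{4} \int_{t=-\infty}^{\infty}
    \Phi\left( \frac{-\beta + q_i(t)}{\sigma_N} \right)
    \mathrm{d}\Phi\left( \frac{t - t_{s,\mu_J}}{\sigma_J} \right)
\right] \tag{4.16}
\]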
To evaluate equation 4.16 we need a good estimate of σJ . This can be obtained
from the eye diagrams, where the width of the eye crossing (see figure 4.23)
is determined by the jitter in the first place, but also by the additive noise
on the signals pi (t) and qi (t) through their zero-crossing slope m. The total
apparent jitter at the eye crossing is then
\[ \sigma_J^2 + \left( \sigma_N / m \right)^2 . \tag{4.17} \]
[Figure 4.23: eye diagram with a horizontal histogram taken around the eye
crossing level; hit count (0 to 500) versus time (ps), −800 to 800 ps]
Figure 4.23: Eye diagram and horizontal histogram around the eye crossing level (laser
i0 = 1.2mA, i1 = 2.9mA; 15dB optical attenuation). This eye has BER = 4 · 10^−8 at
the optimal sampling time.
Our measurements can be fitted to σJ ≈ 21.5ps and a zero-crossing slope
m corresponding to a 10–90% rise time of 275ps when approximating the
edges of pi (t) and qi (t) by hyperbolic tangents (as is done, e.g., in [Sabet
and Ilponse, 2001]). At a given oma, the BER can be plotted as a function
of both the mean relative sampling time ts,µJ and the threshold offset β to
assess the impact of timing jitter and offset. Figure 4.24 shows such a plot for
parameters corresponding to oma = −9dBm. We observe that small changes
of the threshold offset β can already have a profound impact on the BER.
At this oma setting, βauto/oma[mW] ≈ 0.035 when using automatic threshold
detection.
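Equation 4.16 can be evaluated with a simple numerical integration. The Python sketch below is an illustration under stated assumptions and not the computation behind figure 4.24: all eight bit patterns are built from tanh edges with the fitted 10–90% rise time, the levels are taken as ±oma/2, the bit period is 800 ps (1.25 Gbps), and the jitter integral is approximated by a weighted sum over ±6σJ. The function names are illustrative.

```python
import math

BIT_PERIOD_PS = 800.0   # 1.25 Gbps
RISE_TIME_PS = 275.0    # fitted 10-90 % rise time
SIGMA_J_PS = 21.5       # fitted timing jitter
# tanh(t / TAU) runs from 10 % to 90 % of its swing over 2 * atanh(0.8) * TAU.
TAU_PS = RISE_TIME_PS / (2.0 * math.atanh(0.8))

def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def pattern_signal(prev_bit, cur_bit, next_bit, t_ps, oma_mw):
    """Noise-free signal (mW, relative to mid-level) for a 3-bit pattern whose
    central bit is centered on t = 0."""
    lv = [oma_mw * (b - 0.5) for b in (prev_bit, cur_bit, next_bit)]
    first_edge = 0.5 * (1.0 + math.tanh((t_ps + BIT_PERIOD_PS / 2.0) / TAU_PS))
    second_edge = 0.5 * (1.0 + math.tanh((t_ps - BIT_PERIOD_PS / 2.0) / TAU_PS))
    return lv[0] + (lv[1] - lv[0]) * first_edge + (lv[2] - lv[1]) * second_edge

def expected_ber(ts_ps, oma_mw, beta_mw, sigma_n_mw,
                 sigma_j_ps=SIGMA_J_PS, n_steps=241):
    """Equation 4.16, with the jitter integral replaced by a normalized
    weighted sum over +-6 sigma_J around the mean sampling time ts_ps."""
    total = 0.0
    weight_sum = 0.0
    for k in range(n_steps):
        z = -6.0 + 12.0 * k / (n_steps - 1)
        weight = math.exp(-0.5 * z * z)
        t = ts_ps + z * sigma_j_ps
        err = 0.0
        for prev_bit in (0, 1):
            for next_bit in (0, 1):
                p = pattern_signal(prev_bit, 1, next_bit, t, oma_mw)
                q = pattern_signal(prev_bit, 0, next_bit, t, oma_mw)
                err += phi((beta_mw - p) / sigma_n_mw)   # a 1 read as 0
                err += phi((-beta_mw + q) / sigma_n_mw)  # a 0 read as 1
        total += weight * err
        weight_sum += weight
    return total / (8.0 * weight_sum)

# Parameters for oma = -9 dBm, using the fit of equations 4.14 and 4.15.
OMA_MW = 10.0 ** (-0.9)
SIGMA_N_MW = math.hypot(0.059 * OMA_MW, 10.0 ** (-2.75))
BETA_MW = 10.0 ** (-2.38)
center_ber = expected_ber(0.0, OMA_MW, BETA_MW, SIGMA_N_MW)
crossing_ber = expected_ber(400.0, OMA_MW, BETA_MW, SIGMA_N_MW)
```

Sweeping `ts_ps` at fixed `beta_mw` traces a bathtub-shaped curve: low around the eye center and rising steeply towards the crossings at ±400 ps.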
When the mean relative sampling time ts,µJ is sufficiently far from the eye
crossing, the BER is essentially independent of the precise sampling instant in
an interval of several 100s of picoseconds. It turns out that when a given BER
is proposed, the length of the sampling window (denoted τs) in which the
proposed or better BER is attained is relatively insensitive to the exact values
of oma and β, as long as the noise floor given these values is significantly
lower than the proposed BER. For instance, at 1.25Gbps, given a proposed
BER of 10^−12, in the IO receiver circuits τs is 360 ps at oma = −9dBm, rising
to only 410 ps at oma = 0dBm. For oma < −9dBm, the noise floor comes too
close to 10^−12 and τs shortens significantly.
Given a desired array-wide oma ≥ −9dBm and a 6σall safety range, an average
oma of −6.4dBm should be targeted. This corresponds to a minimal VCSEL
modulation current of 0.42mA at 20 °C. At an elevated temperature of 70 °C
this value becomes 0.60mA due to the derating of the laser efficiency (−1.3dB)
and the photodiodes (< −0.25dB). At a nominal current configuration i0 =
1.6mA/i1 = 3.0mA, the signal modulation amplitude still shows a surplus
of 3.7 dB which can serve to compensate the increased overall noise figure at
higher temperatures.

[Figure 4.24: contour plot of the BER, with contours from 10^−1 down to 10^−15;
threshold offset (−0.5 to 0.5) versus mean relative sampling time (ps),
−400 to 400 ps; the sampling window is marked on the 10^−12 contour]
Figure 4.24: Calculated expected BER for oma = −9dBm at 1.25Gbps as a function of
both the mean relative sampling time ts,µJ and the threshold offset β.
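The targeted average oma follows from adding the safety margin, in dB, to the required minimum. A minimal Python sketch (the function name is illustrative):

```python
def target_average_oma_dbm(min_oma_dbm, sigma_all_db, n_sigma=6.0):
    """Average oma to aim for so that a channel that is n_sigma standard
    deviations below average still reaches min_oma_dbm."""
    return min_oma_dbm + n_sigma * sigma_all_db

# -9 dBm minimum, sigma_all = 0.44 dB, 6 sigma safety range.
target = target_average_oma_dbm(-9.0, 0.44)
print(round(target, 1))
```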
4.5 Putting things to work
We are now in a position to address the questions formulated at the beginning
of this chapter: (i) can we use a reference channel from which the sampling
instant is derived for all channels, obviating the need for clock recovery
circuitry for each channel; and (ii) can we use a reference channel from which
the decision threshold for all channels can be obtained, obviating the need for
dc-balanced coding?
4.5.1 Common sampling instant
We assume that we keep the per-channel threshold and dc-compensated
channel coding, and employ a circuit architecture where all channels are
transmitting data truly synchronously [Gui et al., 2005]. One channel (channel
number d = 0) is chosen as dedicated clock channel and is used to transmit
a continuous symmetric square wave signal. At the receiver side, one clock
synchronization circuit is used, which phase-locks on the center of the bit
period of the dedicated clock signal. A locally generated clock is distributed
to sample the receiver output of all other channels.
We shall now estimate the probability of attaining a BER of 10^−12 over all
data channels simultaneously. We assume a timing jitter σJ comparable to the
value reported in the previous section. We have accounted for this jitter in the
derivation of the BER curves as a function of the sampling instant (figure 4.24).
The cross-section at a constant threshold offset β has a bathtub shape, the
width of which is virtually independent of β and oma, as long as the latter is
sufficiently large. From the previous section, a value τs = 360ps is a reasonable
assumption for the length of the sampling window in which the BER < 10^−12
when the noise floor is sufficiently below this limit.
The individual (jitter-free) sample times Ts,d in each of the nd− 1 data channels
can be written as
\[ T_{s,d} = T_{cd,d} - T_{cd,0}, \tag{4.18} \]
where Tcd,d is composed of the delay of the d-th receiver circuit as well as
the clock distribution delay from a common clock node. We assume that
the feedback path in the clock recovery circuit has a similar delay as the
distribution path to the data receivers.
The delays Tcd,d can be considered i.i.d. with a normal distribution N(δ, σs²).
Note that the independence assumption is not valid for e.g. clock distribution
trees, but it puts us on the safe side in the following derivation. The skew
figure of 42 ps extracted from the data of figure 4.21 provides an estimate
for σs; we use 60ps in our calculations to account for an equal amount of
transmitter-side skew (again erring on the safe side as the skew contribution at
the transmitter side is likely lower than at the receiver side). The dependence
of Tcd,d on the exact optical modulation amplitude can be safely ignored: the
all-inclusive standard deviation σall of OMA[dB] of about 0.44 dB causes only
7.7 ps of skew, given the 17.5 ps/dB sensitivity of the receiver delay to the
optical modulation amplitude extracted above.
The sought probability is given by
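\[
\begin{aligned}
P\left[ -\frac{\tau_s}{2} < \text{all } T_{s,d} \le \frac{\tau_s}{2} \right]
&= E\left[ P\left[ -\frac{\tau_s}{2} < \text{all } T_{s,d} \le \frac{\tau_s}{2} \,\middle|\, T_{cd,0} \right] \right] \\
&= E\left[ \left( P\left[ -\frac{\tau_s}{2} < T_{cd,1} - T_{cd,0} \le \frac{\tau_s}{2} \,\middle|\, T_{cd,0} \right] \right)^{n_d - 1} \right] \\
&= \int_{t_{cd,0}=-\infty}^{\infty}
   \left[ \Phi\left( \frac{\frac{\tau_s}{2} + t_{cd,0} - \delta}{\sigma_s} \right)
        - \Phi\left( \frac{-\frac{\tau_s}{2} + t_{cd,0} - \delta}{\sigma_s} \right) \right]^{n_d - 1}
   \mathrm{d}\Phi\left( \frac{t_{cd,0} - \delta}{\sigma_s} \right)
\end{aligned}
\tag{4.19}
\]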
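The last integral of equation 4.19 is one-dimensional and can be evaluated with a plain trapezoidal rule. The Python sketch below substitutes z = (tcd,0 − δ)/σs, so that δ drops out. It uses the rounded values τs = 360 ps and σs = 60 ps from the text; because these inputs are rounded and the result is sensitive to them, the sketch should not be expected to reproduce reported yield figures exactly. The function names are illustrative.

```python
import math

def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def common_clock_yield(n_channels, window_ps, sigma_s_ps, z_max=8.0, steps=4001):
    """Equation 4.19: probability that all n_channels - 1 data channels are
    sampled inside a window of length window_ps, for i.i.d. normal delays
    with standard deviation sigma_s_ps (trapezoidal rule in z)."""
    half = 0.5 * window_ps / sigma_s_ps
    dz = 2.0 * z_max / (steps - 1)
    total = 0.0
    for k in range(steps):
        z = -z_max + k * dz
        inside = phi(half + z) - phi(-half + z)  # one channel, given channel 0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * inside ** (n_channels - 1) * density * dz
    return total

yield_64 = common_clock_yield(64, 360.0, 60.0)
yield_16 = common_clock_yield(16, 360.0, 60.0)
print(yield_64, yield_16)
```

Lowering the bitrate widens the sampling window, so repeating the calculation with a larger `window_ps` shows how the yield recovers.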
In the IO project setup with per-channel resynchronization, an array-wide
BER < 10^−12 can be obtained on all 64 channels. When using a common
sampling instant, the substitution of our experimental data into equation 4.19
evaluates to a yield of 56% for 64 channels at 1.25Gbps. The bitrate can
however be lowered to attain a higher yield. Figure 4.25 shows the maximal
bitrate as a function of the number of channels nd if a 99.9% yield of a given
array-wide BER is required. For the extrapolation to bitrates lower than
1.25Gbps the random jitter σJ was (pessimistically) chosen to scale with the
bit period. It is clear that the maximal bitrate initially falls quickly at very
small nd (< 20), but the decrease is much slower as nd increases further. This
result indicates that efficient source-synchronous parallel optical interconnect
is well within reach of well-crafted systems optimized for this mode of operation.

[Figure 4.25: maximal bitrate (Gbps), 0.7 to 1.1, versus number of channels,
0 to 100, with curves for BER = 10^−12 and BER = 10^−9; nd = 64 is marked]
Figure 4.25: Maximal synchronous bitrate as a function of the number of channels nd
if a 99.9% yield of a given array-wide BER is required.
4.5.2 Common logic threshold
Here we assume that the logic threshold derived from the average photocur-
rent of one dedicated channel—transmitting a continuous symmetric square
wave—is used as a logic threshold in all other channels. The signal sampling
times are now considered to be near the center of the eye where the noise
behavior is stationary.
Again we will estimate the probability of attaining a BER figure of 10^−12
over all data channels simultaneously. The logic threshold extracted from the
dedicated channel (at channel number d = 0) is written as AVG0; the global
connector alignment is denoted A. Conditionally on A and AVG0, the events
of different data channels attaining a BER < 10^−12 are independent. Hence
the joint probability can be expressed as a product:
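\[
\begin{aligned}
P\left[ \text{BER of } (n_d - 1) \text{ channels} < 10^{-12} \right]
&= E\left[ P\left[ \text{BER of } (n_d - 1) \text{ channels} < 10^{-12} \,\middle|\, A, \mathrm{AVG}_0 \right] \right] \\
&= E\left[ \left( P\left[ \text{BER of 1 channel} < 10^{-12} \,\middle|\, A, \mathrm{AVG}_0 \right] \right)^{n_d - 1} \right]
\end{aligned}
\tag{4.20}
\]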
[Figure 4.26: yield (%), 0 to 100, versus number of channels, 0 to 100, with
curves for BER limits 10^−6, 10^−9 and 10^−12 at i0 = 1.6 mA/i1 = 3.0 mA and at
i0 = 1.6 mA/i1 = 5.0 mA; nd = 64 is marked]
Figure 4.26: Probability of attaining the indicated BER array-wide as a function of the
number of channels nd, at a logic ’0’ VCSEL current i0 = 1.6mA and i1 currents 3mA
and 5mA.
The stochastic model of equations 4.9 and 4.10 is used as a model for
OMA (i0, i1) and AVG (i0, i1). To account for photodiode responsivity and
receiver amplification variations, an additional per-channel normally distributed
deviation with a standard deviation of 0.29dB (extracted in section 4.4.3) is
implicitly added jointly to OMAd[dBm] and AVGd[dBm]. Equation 4.20 is evaluated
using a Monte Carlo method; this is
treated in appendix D.2.
At VCSEL currents i0 = 1.6mA and i1 = 3.0mA and without inserted optical
attenuation, the probability of a BER < 10^−12 on one channel using a
threshold dictated by another channel is only 43%. The standard deviation
of β/oma[mW] is around 14%. These figures still ignore process variations
of limiting amplifier operating points—a bias effect which is automatically
compensated when per-channel threshold feedback is used. At a fixed laser
bias i0, higher modulation amplitudes yield higher probabilities as the mod-
ulation current increases faster than the threshold error, especially when
OMA is small compared to AVG. Figure 4.26 shows the array-wide yield as
a function of the number of channels nd at different desired BER limits and
modulation currents. It is clear that, for the characteristics extracted from the
IO project setup, for any decent number of channels the usage of a common
logic threshold is infeasible.
Chapter 5
Measurement and evaluation of direct substrate noise
As discussed in section 2.6.2, optical receiver circuits need to amplify pho-
tocurrents in the µA–range to rail-to-rail voltage swings. The large parasitic
capacitance of photodiodes translates voltage noise across them into current
noise which can overwhelm the communication signal. Although a differ-
ential circuit approach can mitigate this problem, the circuit itself must be
adequately protected as well against unbalanced disruptions. When optical
receiver circuits are integrated in a chip alongside each other and rapidly
switching digital circuits—as is the case in OE-VLSI interconnect—coupling
of digital switching noise through the substrate can severely compromise
the accurate operation of the receiver circuits. Besides possible OE-VLSI is-
sues, substrate noise is an important issue in all mixed-signal designs where
sensitive analog circuits are embedded in a hostile digital environment.
In this chapter we present an experimental environment to characterize the
sensitivity of embedded analog circuits to digitally generated substrate noise.
Our measurement technique is based on equivalent-time substrate voltage
sampling and uses a simple differential latch comparator without explicit
input sample-and-hold. A surprisingly large measurement bandwidth is
observed, which is explained from the detailed circuit behavior. On the IO
project receiver IC in 0.18 µm CMOS, we have demonstrated that our system
can accurately trace the waveform of pulses as narrow as 200 ps. It turns out that
the protection of the receiver circuits by a guard ring approach—surrounding
the circuits with a dense succession of interconnected substrate contacts—is
adequate to absorb locally generated substrate disruptions.
Additionally, the extraction of precise measurement data from observations
that are excessively corrupted by additive noise and timing jitter is addressed.
We present simple yet very effective methods to accurately reconstruct pulse
waveform features without the use of delicate deconvolution operations.
5.1 Introduction
Mixed-signal designs often need to embed sensitive analog circuits in between
fast digital circuits. Coupling of digital switching noise through the substrate
can then severely compromise the analog behavior. Careful separation of
both circuits at the circuit and layout levels—using highly decoupled separate
power and ground nets and a layout with minimal coupling between sensitive
metal tracks—leaves only coupling through the conductive substrate as the
remaining noise mechanism. Unless costly triple-well CMOS technology is
applied, this coupling cannot be eliminated completely. The prediction of
substrate currents and the resulting bulk potential variations and ground
bounce, and the design of noise-tolerant circuits are active research topics
[Nagata et al., 2001; Donnay and Gielen, 2003; Badaroglu et al., 2004; Owens
et al., 2005; Badaroglu et al., 2006].
In this chapter, we report on a measurement technique for very fast periodic
bulk potential variations in bulk-type CMOS. The approach was applied on
the 0.18 µm receiver test IC of the IO project (see appendix A), where we
need to quantify the robustness of sensitive optical receiver circuits to locally
injected substrate noise. To this end, we have designed controllable substrate
noise generation circuits and modeled the substrate coupling using Cadence’s
Substrate Noise Analyst (SNA) software [2004]. Our measurement circuit for
fast substrate noise has been integrated alongside the noise generation circuits
as a means to validate the substrate model through direct measurement. The
measurement circuit is analyzed, and a surprisingly small sampling window
and a correspondingly large effective bandwidth are observed. Furthermore,
a new and efficient technique was developed to accurately extract pulse
widths and heights from measurement data deeply buried in jitter noise.
The proposed technique is simple and does not require numerically delicate
deconvolution operations.
Our approach is based on the equivalent time measurement technique [Makie-
Fukuda et al., 1996; Ho et al., 1998; Nagata et al., 2000; Casper et al., 2003]
and hence is applicable when the generated noise is periodic. Here, the
substrate signal is sampled at different phases in subsequent periods. Each
sample is compared to an adjustable reference voltage using a latch com-
parator. By adapting the reference voltage for each value of the sampling
delay, the waveform of the signal can be reconstructed. The result of the
comparison is observable at a standard digital I/O pad, obviating the need for
high-bandwidth analog I/O. The repetitive sampling principle has far wider
applicability than substrate noise measurements, and most results will apply
5.2 Problem Denition and Previous Work 119
to the measurement of any reproducible signal.
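The repetitive sampling principle described above can be illustrated with a minimal Python sketch. It assumes an ideal, noise-free comparator and a hypothetical test signal (a 20 mV, 200 ps raised-cosine pulse repeating every 10 ns); the names substrate_signal, comparator and trace_waveform are illustrative only and do not describe the actual measurement setup. For each sampling delay, the reference voltage is bisected until the comparator decision flips, and every comparison stands for one repetition of the periodic signal.

```python
import math

def substrate_signal(t, period=10e-9):
    """Hypothetical periodic test signal: 20 mV, 200 ps raised-cosine pulse."""
    t = t % period
    start, width, height = 2e-9, 200e-12, 0.020
    if start <= t <= start + width:
        return height * 0.5 * (1.0 - math.cos(2.0 * math.pi * (t - start) / width))
    return 0.0

def comparator(delay, vref):
    """Ideal latch comparator: 1 if the signal exceeds vref at the sampling instant."""
    return 1 if substrate_signal(delay) > vref else 0

def trace_waveform(delays, v_lo=-0.05, v_hi=0.05, steps=16):
    """For each sampling delay, bisect vref until the comparator output flips."""
    samples = []
    for delay in delays:
        lo, hi = v_lo, v_hi
        for _ in range(steps):
            mid = 0.5 * (lo + hi)
            if comparator(delay, mid):   # signal above vref: raise vref
                lo = mid
            else:                        # signal below vref: lower vref
                hi = mid
        samples.append(0.5 * (lo + hi))
    return samples

# 10 ps steps across the pulse; only a binary result is read out per comparison
delays = [1.9e-9 + k * 10e-12 for k in range(41)]
waveform = trace_waveform(delays)
```

Only the binary comparator output leaves the chip in this scheme, which is why no high-bandwidth analog output is needed.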
We have compared our substrate noise measurements to circuit-level sim-
ulations. To this end, we have used SNA to extract a three-dimensional
RC-network as a simulation model for the substrate. For simulation purposes,
the substrate model was then coupled to the IC schematic, which has been
specially modified to bring all significant substrate connections—CMOS de-
vice backsides and explicit substrate connections—outside as pins. A good
match between simulations and measurements was obtained.
This chapter is structured as follows. In section 5.2, we briefly discuss the
causes and effects of substrate noise, the noise measurement principle, pre-
vious work and the original contributions. Next, the measurement circuit is
presented. The following sections take a closer look at the properties of the
latch comparator. In section 5.3, the dynamic behavior of the circuit is studied,
and in section 5.4 we investigate measurement inaccuracies resulting from
a noisy voltage reference or sampling time jitter. Finally, in section 5.6, we
report on the good agreement between measured and simulated substrate
disturbances, the observed impact of direct substrate noise on the receiver
circuits, and on the performance of our noise elimination technique.
5.2 Problem Denition and Previous Work
5.2.1 Substrate noise
The term substrate noise comprises all effects caused by the switching of digital
circuit nodes, which change the bulk potential underneath sensitive devices of
the analog circuit, or inject current into substrate contacts. There are several
sources of substrate noise. The most important reported direct causes [Su
et al., 1993; Donnay and Gielen, 2003; Badaroglu et al., 2004] are capacitive
coupling from MOSFET source and drain nodes and impact ionization from
the MOSFET channel—a carrier generation process where highly energetic
carriers collide with the crystal lattice—which inject current into the substrate.
The effect on the substrate is characterized by short, sharp pulses occurring
nearly simultaneously with the switching events that cause them. Substrate
voltage transients have very specific properties: the voltages typically take
values in an interval of only some tens of mV around the off-chip reference
ground, and the output impedance of a floating bulk contact is relatively high
[Su et al., 1993]. The bulk voltage instantaneously affects the threshold voltage
of the involved MOSFETs, and capacitive coupling with various circuit nodes
is omnipresent.
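As a reminder of why the bulk voltage matters, the textbook first-order body-effect relation for an n-channel MOSFET can be written as follows. The symbols V_{t0}, γ and φ_F (zero-bias threshold voltage, body-effect coefficient and Fermi potential) are standard device parameters and are not taken from this work; V_{SB} is the source-to-bulk voltage.

```latex
\[
V_t = V_{t0} + \gamma\left(\sqrt{2\phi_F + V_{SB}} - \sqrt{2\phi_F}\right),
\qquad
\frac{\partial V_t}{\partial V_{SB}} = \frac{\gamma}{2\sqrt{2\phi_F + V_{SB}}}
\]
```

To first order, a small excursion of the bulk potential therefore shifts the threshold voltage in proportion to that excursion.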
Figure 5.1: Observed direct and indirect substrate disturbance waveforms (voltage
in mV versus time in ns; traces: direct distortion, low-frequency ringing,
high-frequency ringing). Note the large bandwidth differences between both effects.
A major indirect cause of substrate disturbances originates from the power
supply system and is known as ringing. When an external point on a PCB
ground plane close to the IC is used as a common reference for the separate
analog and digital power distribution, the digital on-chip power and ground
lines will exhibit voltage oscillations with respect to this point, and hence
with respect to the analog ground system. These oscillations are caused by
the sharp supply and ground current fluctuations injected into the power
system. This system consists of an RLC resonator composed of the on-chip
decoupling capacitance, and the inductance and resistance of the digital power
and ground lines (bond wires). Even when digital and analog ground lines
are kept fully separate on-chip, there exists a relatively low-impedance path
between them through the substrate, because they contact the same p-bulk
in many places—e.g., at least once in each digital cell. Hence, the ringing of
the digital power system is coupled to the on-chip analog ground reference,
disturbing the analog circuitry. The frequency behavior of ringing is quite
distinct from the effects from the direct causes (figure 5.1), in that the observed
frequencies range between 50 and 500MHz whereas direct substrate noise has
frequency components in the GHz range. In the literature, most attention goes
to the effects of ringing. Indeed, the ringing effect is larger in amplitude and
time scale than the direct effect and considerable effort has been invested in
neutralizing its effects [Donnay and Gielen, 2003]. The ringing effect is much
less apparent when packaging parasitics are reduced, for instance when the
IC is being flip-chipped instead of wire bonded [Owens et al., 2005]. Here
we address the direct substrate disturbances, which are far more difficult
to measure than the effects caused by ringing because they require a much
higher measurement bandwidth.
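The order of magnitude of the ringing frequency follows from the resonance of the RLC loop described above. The sketch below uses the standard series-RLC expressions; the component values (2 nH of bond-wire inductance, 200 pF of on-chip decoupling capacitance, 0.5 Ω of series resistance) are purely hypothetical and are not measured values from this work.

```python
import math

def ringing_frequency(inductance_h, capacitance_f):
    """Undamped resonance frequency of an LC loop, in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

def damping_ratio(resistance_ohm, inductance_h, capacitance_f):
    """Damping ratio of a series RLC loop; values below 1 mean the loop rings."""
    return 0.5 * resistance_ohm * math.sqrt(capacitance_f / inductance_h)

# Hypothetical package and decoupling values (assumptions, not from the text)
L_BOND = 2e-9      # H
C_DECAP = 200e-12  # F
R_LOOP = 0.5       # ohm

f0 = ringing_frequency(L_BOND, C_DECAP)
zeta = damping_ratio(R_LOOP, L_BOND, C_DECAP)
```

With these assumed values the resonance lies near 250 MHz, inside the 50 to 500 MHz range quoted above, and the loop is strongly underdamped.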
We consider bulk-type (lowly doped) substrates, which are quite common
in modern technologies. They have fairly high resistivity and cannot be
considered equipotential; hence they must be modeled as a three-dimensional
RC network. They provide better noise isolation than EPI substrates, as guard
ring structures are far more effective [Su et al., 1993].
5.2 Problem Denition and Previous Work 121
Several tools exist for the extraction and simplification of the equivalent RC
network; we have used the SNA software for this purpose. The extraction
and simplification of the equivalent network from a digital circuit with many
substrate connections is a tedious and computationally complex task, which
quickly runs into long computation times as circuit complexity increases. In
this respect, the availability of small embedded measurement circuitry is a use-
ful complement to pure simulation. In an attempt to predict the experimental
results to validate the measurement technique, we have simultaneously mod-
eled a configurable noise generation circuit together with our measurement
circuit.
5.2.2 Measuring fast substrate signals
Substrate voltage wave tracing has never been straightforward: direct electrical
on-chip probing is generally impossible, and in addition the inductance and
large capacitive load of a single-ended external probe disrupt the signal. In an
indirect approach, substrate noise has been measured through its impact on
the threshold voltage of a single MOSFET [Su et al., 1993]. The bandwidth
of this method is severely limited by probing parasitics. Integration of a fast
wideband analog amplifier and bonding out the amplifier output is another
technique [Donnay and Gielen, 2003; Owens et al., 2005], which however
requires the use of very high-bandwidth analog output pads.
On-chip real-time sampling of signals that vary with the achievable speed
of the used IC technology is infeasible with circuits of the same technology:
either the sampling circuits need to take samples impossibly fast, or a very
large number of parallel sampling circuits would be required. Furthermore, a
very large bandwidth to store the measurement data or to bring them off-chip
is required. However, if the longer acquisition time is no problem and if
the signals are periodic, equivalent time sampling measurements using on-chip
voltage comparators constitute a simple and low-cost alternative. Sampling
can be done over subsequent instances of the signal, with only one sampling
circuit. The phase between the periodic waveform and the sampling time is
made adjustable by on-chip or external means, in order to gather samples
throughout the whole signal period. Repetitive sampling also eliminates
the need for analog-to-digital conversion as a simple, but repeated on-chip
comparison with an externally provided analog voltage will suffice. The result
of the comparison is a low-bandwidth binary signal. There is no restriction
on the complexity or duration of the repetitive experiment as long as the
repetitions are accurate and deterministic (see Casper et al. [2003] who used a
full pseudorandom digital pattern as stimulus in one repetitive experiment).
Makie-Fukuda et al. have used chopper-type single-ended voltage comparators
[Makie-Fukuda et al., 1996] to sample substrate voltages in equivalent time.
The single-ended nature of this circuit makes the comparison itself vulnerable
to MOSFET threshold variations and power supply ringing. A differential and
more robust approach was taken by Nagata et al. [2000] where differential
latch comparators are employed. A latch comparator is a circuit that performs a
clocked comparison of two voltages. Such circuits are widely deployed in A/D
converters [Yukawa, 1985; Cho and Gray, 1995; De Maeyer et al., 2004] and
dynamic RAM circuits [Montanaro et al., 1996]. Latch comparators have also
been applied successfully for on-chip sampling of other signals for diagnostic
purposes, e.g., to verify the timing of a SRAM circuit [Ho et al., 1998] or to
monitor the eye diagram of a communication link [Casper et al., 2003].
Our experimental setup was inspired by Nagata et al. [2000]. However,
our approach is different from theirs as we aim at observing the fast direct
substrate noise effect instead of the ringing. Where there was plenty of
measurement bandwidth for the observation of ringing, now the bandwidth
of the examined signals becomes much higher. Hence analyses of the intrinsic
measurement bandwidth and the effects of additive noise and jitter were
considered essential. Observing the direct substrate noise in high-resistive
substrates also requires full control over the relative location of noise sources
and detectors, while in EPI-type substrates, distances of more than about 4
times the thickness of the epitaxial layer are considered to exhibit equal noise
coupling [Su et al., 1993]. Our noise generation circuit provides very localized
substrate noise sources, and the location of our detector circuit is judiciously
chosen. The detector itself is very simple: it requires only one comparison
voltage and only one measurement trigger clock. Figure 5.2 shows the circuit
diagram. The measurement bandwidth is surprisingly large and does not, as
is commonly believed, depend on the time constant of the regenerative latch.
The latch comparator is modeled after the sense amplifier design for the
memory cells of the StrongARM microprocessor [Montanaro et al., 1996]. It
essentially consists of a bistable cross-coupled latch (transistors T4 to T7) fed
by a differential stage consisting of transistors T1 to T3. When sampling, the
latch is released from its metastable state with an initial imbalance resulting
from the differential stage. This imbalance reflects the difference between the
reference voltage and the substrate signal; the sign of the difference will cause
the latch to move to the corresponding stable state.
5.3 Estimating bandwidth and linearity of the latch comparator
Let us now take a closer look at the behavior of the latch comparator aiming at
a more precise characterization of its bandwidth properties. In equivalent time
sampling measurements, usually a sample/hold circuit is used to capture the
signal to be measured. The sampling circuits used are linear to a very
good approximation, and their equivalent bandwidth is determined by their
aperture time, i.e., the time window T during which the signal is observed
and somehow averaged to provide a single sample value.
Figure 5.2: Circuit diagram of the latch comparator used in our measurement circuit.
(Labels: supply; transistors T1 to T7; internal nodes N1 to N5; inputs VG1 (clock),
VG2 (Vin with offset) and VG3 (Vref); comparison output Y.)
Figure 5.3: Substrate and reference connections of the measurement circuit, simplified
for our empirical analysis. (Labels: latch comparator with clock input VG1, inputs VG2
and VG3 and comparison output Y; Vsub, Vref, equivalent substrate resistance, bias V0.)
Figure 5.4: Simulated operation of the latch comparator (voltage in V versus time in
ps; traces VG1, VG2, VG3 = Vref and VN1 to VN5; epochs t0, t1 and t2 marked). The
voltage Vref was chosen such that the latch was released very close to its metastable
equilibrium point. Note that the voltage VG2 undergoes a marked voltage drop during
the measurement, due to capacitive voltage kickback from node N1 into the finite
substrate impedance. The actual substrate signal to be measured is superposed onto
VG2 and is not discernible on the scale of the figure.
In our system, no explicit sample/hold circuit is deployed. However, it is
obvious that during a brief period immediately following a positive clock
transition, the input signal is observed and compared to the reference value,
causing the latch to move to its final state. Consequently, a sampling function
similar to the explicit ones found in traditional circuits must be present. It is
much less obvious to find out whether the ensuing sampling action can be
considered a linear operation, and if so, to determine its equivalent bandwidth.
Indeed, during operation, all transistors operate in their large signal regime,
where there exists no simple linear relationship between e.g. the gate voltage
and drain current of a FET. Hence a more detailed analysis is called for.
5.3.1 Bandwidth
Figure 5.4 shows the simulated operation of the circuit when an input voltage
is sampled, and where the reference voltage is set to a value that releases the
latch very close to its metastable operating point. This is the situation we
achieve in noise-free conditions when the measurement outcome contains 50%
1s and 50% 0s. From the time behavior of the circuit, the sampling operation
can be inferred. Initially, when the clock input is still low, transistor T1 is
switched off, and the drains of transistors T2 and T3 are both pulled high. Node
N1 assumes an intermediate voltage slightly below VG2. When the clock signal
rises above threshold (epoch t0), T1 starts conducting, pulling the common
source node N1 low. At the same time, the negative transition on node N1
induces a negative transition on VG2 due to capacitive coupling and the fact
that the gate of T2 is coupled to the high-ohmic substrate (voltage kickback).
In our case, we assume that this kickback effect is signal-independent and
hence will result in an overall offset.
At time t1, when node N1 has dropped Vt below the gate voltages of T2 and
T3, respectively, a drain current starts to flow in these transistors. This current
discharges the respective drain nodes N2 and N3, bringing them low. The
discharge rates are proportional to the respective drain currents, which, in
turn depend on the instantaneous gate-to-source voltages of the transistors T2
and T3.
Due to the differences between the gate voltages of T2 and T3, the nodes
N2 and N3 are discharged at a different rate, and a voltage difference builds
up between them. At time t2, when both nodes have dropped sufficiently
low, the transistors T4, T5, and later T6, T7 start conducting, activating the
positive feedback loop of the latch, which will exponentially move to a stable
state corresponding to the initial voltage difference on nodes N2 and N3.
As already mentioned, during actual measurements the reference voltage is
adjusted until this initial voltage difference is as small as possible, releasing
the latch as close as possible to its metastable operating point.
Hence, the measurement value is the specific value of the reference voltage
which leads to as small as possible a voltage difference between nodes N2
and N3 at time t2, and the actual buildup of this difference (i.e., the period
during which the input and the reference are compared) takes place between
t1 and t2. This results in a sampling period T = t2 − t1, which determines
the equivalent bandwidth of the measurement. As can be seen in figure 5.4,
this sampling window can be much smaller than the regeneration time of the
latch.
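The link between the sampling window and the equivalent bandwidth can be checked with an idealized model. The sketch below assumes that the comparator weights the input uniformly over the window T (a boxcar aperture), which is a simplification and not the actual weighting of the circuit; the value T = 20 ps is the one used in section 5.3.2. Under this assumption the amplitude response is |sin(πfT)/(πfT)|.

```python
import math

def boxcar_gain(freq_hz, window_s):
    """Amplitude response of uniform averaging over a window of length window_s."""
    x = math.pi * freq_hz * window_s
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)

def bandwidth_3db(window_s):
    """Bisect for the frequency where the boxcar gain drops to 1/sqrt(2)."""
    lo, hi = 0.0, 1.0 / window_s   # gain falls monotonically from 1 to 0 here
    target = 1.0 / math.sqrt(2.0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if boxcar_gain(mid, window_s) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

T = 20e-12                    # sampling window from section 5.3.2
f3db = bandwidth_3db(T)       # close to 0.443 / T
first_notch = 1.0 / T         # first zero of the boxcar response
```

For T = 20 ps this idealization gives a −3 dB bandwidth of about 22 GHz and a first notch at 50 GHz. The analysis of section 5.3.2 reports roughly 25 GHz, so the uniform-window figure should be read as an order-of-magnitude check only.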
5.3.2 Linearity
The question remains as to whether the equivalent relationship between
the input voltage and the measurement result is linear or sufficiently close
to linear, so that accurate waveform measurements can be performed. To
investigate this question, we use a semi-empirical approach, inspired by the
simulation of figure 5.4. Hereto, we determine the incremental effect of the
small substrate signal when observed during the measurement operation.
We model the difference in the drain currents Idiff of T2 and T3 during the
sampling of a sinusoidal waveform Vin = V1 sin (ωt+ φ), which represents
the net incremental effect of the substrate voltage onto the gate of T2. V1 is a
small amplitude (e.g., 10mV), typical of direct substrate disturbances. We set
the time origin to epoch t1 and approximate the relevant node voltages by
\begin{align}
V_{G2} &= V_0 + V_{in} + V_2 e^{-\lambda t} \tag{5.1}\\
V_{G3} &= V_{ref} \tag{5.2}\\
V_{N1} &= \left(V_{ref} - V_t\right) e^{-\lambda t}, \tag{5.3}
\end{align}
where V0 is the DC-bias of the gate of T2 and the last term in equation 5.1
approximately represents the voltage kickback onto the gate of T2, clearly
visible in figure 5.4. Equation 5.3 approximates the downward transition
of the common source voltage at node N1 from the point where T3 starts
conducting. The rate λ is the inverse of the time constant of the transition of
node N1 and was extracted from the simulation result of figure 5.4. These
waveforms are simple approximations, but are chosen such that they allow
symbolic manipulation of the equations. For our analysis we assume a simple
transistor model, in which the drain current of a FET in its saturation region
is proportional to $(V_{GS} - V_t)^2$. The results of our analysis are essentially
independent of the detailed FET characteristics or the precise waveforms of
VG2, VG3 and VN1 because we only need to consider small deviations from the
actual voltage trajectories rather than the trajectories themselves. Therefore,
any more sophisticated model will lead to the same qualitative conclusions.
Using these definitions and assumptions, we obtain
\[ I_{diff} \propto \left(V_{G2} - V_{N1} - V_t\right)^2 - \left(V_{G3} - V_{N1} - V_t\right)^2, \tag{5.4} \]
and the resulting voltage difference between N2 and N3 at time t2 is proportional
to
\[ V_{diff} \propto \int_{t_1}^{t_2} I_{diff}\, dt. \tag{5.5} \]
We now set Vdiff = 0 in equation 5.5, and solve this equation for Vref
symbolically¹. The complicated expression for Vref thus obtained represents
the measurement outcome, and its dependence on the amplitude V1, frequency ω
and phase φ of the input signal will reflect the linearity or the lack
thereof. A perfectly linear response should result in Vref having the form
$V_{ref} = a(\omega) \sin(\omega t + \phi - \phi_0(\omega))$. The value of ω where
$a(\omega) = V_1/\sqrt{2}$ is the equivalent −3 dB bandwidth of the circuit. The
actual expression obtained is however much more complicated.
¹ This was done using Maplesoft’s Maple 9.5 suite. Although results were obtained in
closed form, they are far too complex to be shown explicitly.
Figure 5.5: The amplitude response of the latch comparator (gain in dB and phase in
degrees versus frequency in GHz). (a) Simplified analysis: the response (total output,
first-order term) and its second-order term based on our empirical analysis
(V1 = 0.1 V). (b) Transistor-level simulation: gain and phase obtained through
simulation of the circuit with sinusoidal input (legend: gain, A/f^2.18, phase).
To estimate the degree of nonlinearity, we have expanded Vref into a Taylor
series w.r.t. V1, and for each value of ω we have computed the maximal
value of the second order term for varying φ. Numerical values were chosen
according to figure 5.4: V0 = 900 mV, Vt = 500 mV, V2 = 200 mV, 1/λ = 7.5 ps,
and T = t2 − t1 = 20 ps. Fig. 5.5(a) shows a Bode plot of a(ω) and the maximal
value of the second order term, for a signal amplitude V1 = 10 mV. The −3 dB
bandwidth is roughly 25 GHz, and the second-order term initially stays
more than 70 dB below the signal, reducing to 38 dB at 25 GHz. The circuit’s
frequency response was also directly obtained from a simulation run. To this
end, transient simulations were performed with a sinusoidal input with a
10–mV amplitude. For each frequency, the sampling phase was set to five
equally spaced values in the period and the metastable Vref was determined.
Through the five Vref results, a sine wave was fitted, yielding the amplitude
and the phase of the response. The resulting Bode plot is shown in fig. 5.5(b).
As both the bandwidth and the first notch depend on the sampling time,
we can conclude that the estimate of T from our empirical circuit analysis
provides a very good fit. Furthermore, even at input amplitudes as large as
100mV, the second-order term is still 18 dB below the first order term over the
entire bandwidth. Hence we can conclude that our circuit, despite the lack
of sample/hold circuits, is linear to a good approximation in the amplitude
range relevant for our purpose and has excellent bandwidth.
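The semi-empirical model of equations 5.1 to 5.5 can also be evaluated numerically. The sketch below is not the symbolic Maple computation used in this work: it integrates equation 5.4 over the window with a midpoint rule, finds the balancing Vref by bisection, and extracts the first harmonic of Vref with respect to the input phase φ. Clamping negative overdrive voltages to zero is an added assumption that only keeps the bisection bracket well behaved; the function names are illustrative. Parameter values are those quoted above.

```python
import math

V0, VT, V2 = 0.900, 0.500, 0.200   # volts, values quoted in the text
TAU = 7.5e-12                       # 1/lambda, seconds
T = 20e-12                          # sampling window t2 - t1, seconds

def vdiff(vref, v1, freq, phase, steps=200):
    """Integral of Idiff (eq. 5.4) over [0, T], up to a constant factor."""
    dt = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                       # time origin at epoch t1
        e = math.exp(-t / TAU)
        vin = v1 * math.sin(2.0 * math.pi * freq * t + phase)
        vn1 = (vref - VT) * e                    # eq. 5.3
        ov2 = max(V0 + vin + V2 * e - vn1 - VT, 0.0)   # T2 overdrive (clamped)
        ov3 = max(vref - vn1 - VT, 0.0)                # T3 overdrive (clamped)
        total += (ov2 ** 2 - ov3 ** 2) * dt
    return total

def balanced_vref(v1, freq, phase, lo=0.8, hi=1.2):
    """Bisect for the Vref that makes Vdiff vanish (vdiff decreases with vref)."""
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if vdiff(mid, v1, freq, phase) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gain(freq, v1=0.010, n_phase=16):
    """First-harmonic amplitude of Vref versus input phase, relative to v1."""
    re = im = 0.0
    for k in range(n_phase):
        phi = 2.0 * math.pi * k / n_phase
        v = balanced_vref(v1, freq, phi)
        re += v * math.cos(phi)
        im += v * math.sin(phi)
    return 2.0 * math.hypot(re, im) / (n_phase * v1)

gain_1ghz = gain(1e9)
gain_25ghz = gain(25e9)
```

Under these assumptions the gain stays close to unity at 1 GHz and has dropped by roughly 3 dB around 25 GHz, in line with the bandwidth quoted above.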
5.4 Measuring in the presence of jitter and additive noise
We now address the problem where we intend to measure the sharp, isolated
pulses caused by capacitively coupled substrate events, but where the horizon-
tal time base is subject to significant time jitter. It is well known that timing
jitter can drastically obscure the real pulse shape in equivalent time mea-
surements, and should be eliminated. Several authors [Gans, 1983; Souders
et al., 1990; Verspecht, 1994; Coakley et al., 2003] have presented methods to
eliminate both the jitter and additive-noise effects from measurements, under
various hypotheses about the true waveform and the statistical properties of
the noise components. However, all approaches assume that the raw measure-
ment result is an analog value, corrupted by both noise components, for each
sampling instant.
In our measurement system, raw measurement data are binary (1s and 0s),
and indicate the result of the individual comparisons of the substrate signal
with the analog voltage Vref . As indicated in the previous section, when we
want to measure the instantaneous signal value, we set Vref to a value which
yields a 50–50% result on the comparison; this value then represents the
analog value of the measurement. Setting other values for Vref will yield other
relative outcomes of the comparison. In noise-free conditions, the change-over
from a 0–100% to a 100–0% outcome is immediate, while in our test setup, in nearly
noise-free conditions, the change-over occurs in a very narrow interval (less
than one mV). In these conditions, with sharp transitions, a fast simple time
scan with adjustment of Vref to a 50–50% outcome (e.g., using successive
approximations) will provide an accurate measurement result.
In the presence of both timing jitter and additive noise, the change-over
interval broadens and the 50% percentile is no longer a good approximation
of the true waveform. Performing a repeated measurement at time delay
t, with a fixed value of Vref, yields a fraction of 1s denoted m(t, Vref). This
fraction is an unbiased estimate of the probability that the instantaneous
substrate signal value is larger than the fixed Vref. Collecting m(t, Vref)
over a relevant range of values of t and Vref would allow the application of
published deconvolution techniques [Lucy, 1974; Richardson, 1972] to extract
the original uncorrupted waveform. However, the complete measurement is
very time-consuming, in particular when not the entire waveform is required,
but only its main characteristic features such as pulse position, height and
width. Furthermore, deconvolution techniques are notably unstable in ill-
conditioned situations. We shall now describe a new approach aimed at
identifying the true relevant pulse features without having to collect the entire
dataset and without having to identify the jitter distribution and deconvolve
it from the measurements. It is based on collecting a small number of time
scans of m(t, Vref) at well-chosen fixed values of Vref, based on an initial 50%
percentile scan.
We use the following notation. Let v (t) denote the true voltage waveform,
representing an isolated, sharp pulse. The random variable τ represents the
jitter, and is assumed to have a probability density function (pdf) fτ (t). The
additive noise n present during each individual sample is assumed to be
drawn independently from a zero-mean distribution, and is assumed to have
a standard deviation that is significantly smaller than the signal peak value.
We shall first perform our analysis with zero additive noise (which may be a
valid approximation in many cases).
Let t1 (v0) and t2 (v0) denote the time instants of the rising and falling crossing
of the level v0 by v (t), respectively. With these assumptions, the expected
result of a horizontal scan with Vref = v0 can be written as follows:
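\begin{align}
E[m(t, v_0)] &= P\{v(t - \tau) > v_0\} \tag{5.6}\\
&= \int_{-\infty}^{+\infty} f_\tau(u)\, P\{v(t - u) > v_0\}\, du \tag{5.7}\\
&= \int_{-\infty}^{+\infty} f_\tau(u)\, I_{[t_1(v_0),\, t_2(v_0)]}(t - u)\, du, \tag{5.8}
\end{align}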
where E denotes expectation and where I[x,y] (t) is the indicator function of
the interval [x, y] taking on the value 1 iff x ≤ t ≤ y, and 0 otherwise.
Equation 5.8 represents a convolution of the jitter distribution fτ and the
(deterministic) indicator function of the true pulse width at level v0. From this
convolution, one can extract the pulse position and width with good precision,
using Fourier techniques. As indicated, whenever the Fourier transform of
fτ (t) vanishes in the region of interest, its removal by deconvolution is not
as straightforward as it looks. A much simpler approach in the time domain
consists in simply integrating equation 5.8. Assuming we are considering one
single pulse, we obtain
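\begin{align}
M_0(v_0) &= \int_{-\infty}^{+\infty} E[m(t, v_0)]\, dt \tag{5.9}\\
&= \int_{-\infty}^{+\infty} dt \int_{-\infty}^{+\infty} f_\tau(u)\, I_{[t_1(v_0),\, t_2(v_0)]}(t - u)\, du \tag{5.10}\\
&= \int_{-\infty}^{+\infty} f_\tau(u)\, du \int_{-\infty}^{+\infty} I_{[t_1(v_0),\, t_2(v_0)]}(t - u)\, dt \tag{5.11}\\
&= \left(t_2(v_0) - t_1(v_0)\right) \int_{-\infty}^{+\infty} f_\tau(u)\, du \tag{5.12}\\
&= t_2(v_0) - t_1(v_0). \tag{5.13}
\end{align}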
Computing the first moment in the time domain, we find
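\begin{align}
M_1(v_0) &= \int_{-\infty}^{+\infty} t\, E[m(t, v_0)]\, dt \tag{5.14}\\
&= \int_{-\infty}^{+\infty} t\, dt \int_{-\infty}^{+\infty} f_\tau(u)\, I_{[t_1(v_0),\, t_2(v_0)]}(t - u)\, du \tag{5.15}\\
&= \int_{-\infty}^{+\infty} f_\tau(u)\, du \int_{-\infty}^{+\infty} t\, I_{[t_1(v_0),\, t_2(v_0)]}(t - u)\, dt \tag{5.16}\\
&= \int_{-\infty}^{+\infty} f_\tau(u)\, \frac{\left(t_2(v_0) + u\right)^2 - \left(t_1(v_0) + u\right)^2}{2}\, du \tag{5.17}\\
&= \left(t_2(v_0) - t_1(v_0)\right) \frac{t_2(v_0) + t_1(v_0)}{2}, \tag{5.18}
\end{align}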
where we have assumed that the jitter distribution has zero mean. Both
results jointly provide us with position and width estimates of the pulse at
voltage level v0. In view of the large number of samples taken (more than $10^6$
per combination of t and v0), the observed value of m (t, v0) is a very good
unbiased estimate of its expected value. The values of M0 (v0) and M1 (v0)
can hence be estimated as follows:
\begin{align}
\hat{M}_0(v_0) &= \Delta t \sum_{i=1}^{N} m(t_i, v_0) \tag{5.19}\\
\hat{M}_1(v_0) &= \Delta t \sum_{i=1}^{N} t_i\, m(t_i, v_0), \tag{5.20}
\end{align}
where ∆t is the sampling time step and N is the number of time steps. Then,
the pulse width and position are estimated as
\begin{align}
\text{pulse width} &\approx \hat{M}_0(v_0) \tag{5.21}\\
\text{pulse position} &\approx \frac{\hat{M}_1(v_0)}{\hat{M}_0(v_0)}. \tag{5.22}
\end{align}
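The estimators of equations 5.19 to 5.22 are straightforward to apply to a time scan. The sketch below generates a synthetic scan from equation 5.8 under the assumption of zero-mean Gaussian jitter and zero additive noise; the Gaussian pdf, the 100 ps jitter and the pulse edges at 1.00 ns and 1.16 ns are illustrative choices, not measured data.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_fraction(t, t1, t2, jitter_sigma):
    """E[m(t, v0)] of eq. 5.8 for zero-mean Gaussian jitter (assumed pdf)."""
    return phi((t - t1) / jitter_sigma) - phi((t - t2) / jitter_sigma)

def pulse_width_and_position(times, fractions):
    """Zeroth and first moment estimators, eqs. 5.19 to 5.22."""
    dt = times[1] - times[0]
    m0 = dt * sum(fractions)
    m1 = dt * sum(t * m for t, m in zip(times, fractions))
    return m0, m1 / m0

# Synthetic scan: 160 ps wide pulse level crossing, heavily blurred by jitter
T1, T2, SIGMA = 1.00e-9, 1.16e-9, 100e-12
times = [k * 5e-12 for k in range(441)]          # 0 to 2.2 ns in 5 ps steps
scan = [expected_fraction(t, T1, T2, SIGMA) for t in times]
width, position = pulse_width_and_position(times, scan)
```

Although the jitter is comparable to the pulse width and the scan itself looks like a broad, low bump, the moments recover the 160 ps width and the 1.08 ns position without any knowledge of the jitter distribution.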
Let us now analyze how additive noise enters the picture. We extend equation
5.6 considering the noise n additive to v0, and take expectations with respect
to its distribution:
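\begin{align}
E[m(t, v_0)] &= \int_{-\infty}^{+\infty} f_n(x)\, P\{v(t - \tau) > v_0 + x\}\, dx \tag{5.23}\\
&= \int_{-\infty}^{+\infty} f_\tau(u) \int_{-\infty}^{+\infty} f_n(x)\, P\{v(t - u) > v_0 + x\}\, dx\, du \tag{5.24}
\end{align}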
Recalculating the time integrals in equations 5.10 and 5.15, we obtain the
following results:
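\begin{align}
M_0(v_0) &= \int_{-\infty}^{+\infty} E[m(t, v_0)]\, dt \tag{5.25}\\
&= \int_{-\infty}^{+\infty} f_n(x) \left(t_2(v_0 + x) - t_1(v_0 + x)\right) dx \tag{5.26}\\
M_1(v_0) &= \int_{-\infty}^{+\infty} t\, E[m(t, v_0)]\, dt \tag{5.27}\\
&= \int_{-\infty}^{+\infty} f_n(x)\, \frac{t_2^2(v_0 + x) - t_1^2(v_0 + x)}{2}\, dx. \tag{5.28}
\end{align}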
At first sight, these equations do not allow direct extraction of the pulse width
and position at a certain v0, as prior knowledge of the full pulse waveform
v (t) would be required. In our approach, we solve this problem by locally
approximating the pulse by its series expansion. We distinguish two important
cases.
Firstly, we set v0 to half the pulse height. Here, the curvature of the pulse’s
edges is small, and the edge can be well-approximated by its tangent:
$t_i(v_0 + x) = t_i(v_0) + t_i'(v_0)\, x + O(x^2)$, where the prime denotes taking
derivatives. In this case, the integral of equation 5.26 collapses to equation
5.13, hence equation 5.21 is still valid. Equation 5.28 reduces to
\[ M_1(v_0) = \left(t_2(v_0) - t_1(v_0)\right) \frac{t_2(v_0) + t_1(v_0)}{2} + \frac{\sigma^2}{2} \left(t_2'^2(v_0) - t_1'^2(v_0)\right) + O\left(E[n^3]\right), \tag{5.29} \]
where σ2 denotes the variance of n. Obviously, equation 5.22 now gives an
asymptotically biased estimation of the pulse position. For small values of σ
(for example, for the value of 1mV we have observed in our experiments), it
turns out that the resulting bias term is negligible (≈ 0.1ps for a pulse width
of 160 ps), and we can safely ignore it. For larger values of σ, a correction is
needed. A procedure for this is provided in appendix D.3.
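The origin of the bias term can be made explicit with a short worked step. It uses only the tangent approximation stated above (curvature of the edges neglected) together with E[n] = 0 and E[n²] = σ²; it is a sketch of the reasoning and not the correction procedure of appendix D.3.

```latex
\begin{align*}
t_i^2(v_0 + x) &\approx t_i^2(v_0) + 2\,t_i(v_0)\,t_i'(v_0)\,x + t_i'^2(v_0)\,x^2,
  \qquad i = 1, 2,\\
\int_{-\infty}^{+\infty} f_n(x)\,\frac{t_2^2(v_0 + x) - t_1^2(v_0 + x)}{2}\,dx
  &\approx \frac{t_2^2(v_0) - t_1^2(v_0)}{2}
   + \frac{\sigma^2}{2}\left(t_2'^2(v_0) - t_1'^2(v_0)\right),\\
\frac{M_1(v_0)}{M_0(v_0)} - \frac{t_1(v_0) + t_2(v_0)}{2}
  &\approx \frac{\sigma^2}{2}\,\frac{t_2'^2(v_0) - t_1'^2(v_0)}{t_2(v_0) - t_1(v_0)}.
\end{align*}
```

The last line is the position bias. It vanishes when the rising and falling edges have equal slope magnitude at level v0, and it grows with the asymmetry of the pulse.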
Secondly, we may want to estimate the pulse height vtop. To this end, we
approximate the pulse near the peak. Here, the pulse waveform v (t) may be
approximated by a parabola p (t):
\[ p(t) = v_{top} - \frac{d}{2} \left(t - t_{top}\right)^2, \tag{5.30} \]
where ttop and d represent the characteristic parameters, which we can deter-
mine from a least-squares fit with the measurement data. This is described in
appendix D.4.
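For the special case of zero additive noise, the parabola parameters can be recovered from pulse widths alone, which gives a feel for the fit. Assuming the parabola is written as p(t) = vtop − (d/2)(t − ttop)², the width at level v0 satisfies width² = (8/d)(vtop − v0), so a straight-line fit of width² against v0 yields both vtop and d. The sketch below implements this simplified case only; it is not the procedure of appendix D.4, and the function name and numerical values are illustrative.

```python
import math

def fit_peak(levels, widths):
    """Least-squares line through (v0, width^2); returns (vtop, d)."""
    n = len(levels)
    ys = [w * w for w in widths]
    mean_x = sum(levels) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, ys))
    slope = sxy / sxx                  # equals -8/d for an exact parabola
    intercept = mean_y - slope * mean_x
    return -intercept / slope, -8.0 / slope

# Hypothetical pulse top: vtop = 20 mV, d = 0.004 mV/ps^2 (units: mV and ps)
VTOP, D = 20.0, 0.004
levels = [14.0, 16.0, 18.0]            # three fixed Vref levels near the peak
widths = [2.0 * math.sqrt(2.0 * (VTOP - v) / D) for v in levels]
vtop_est, d_est = fit_peak(levels, widths)
```

In practice the widths would be the estimates of equation 5.21 taken at a few reference levels close to the peak, and ttop would follow from the position estimate of equation 5.22.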
5.5 Simulation setup
Circuit-level simulation of direct substrate noise is performed by replacing
the idealized substrate model—one equipotential ground net—by one incor-
porating resistive and capacitive effects. We have used SNA (SNA 3.2/Sub-
strateStorm A3.6b) to extract such a three-dimensional distributed RC model
of the substrate. Figure 5.6 illustrates the procedure.
Technology characterization The SNA/SubstrateStorm software has to be
calibrated for the CMOS technology being used. At the very least, knowledge
of resistivity or—equivalently—doping concentration is required throughout
vertical cross-sections of several regions: p-bulk and n-well below field oxide,
below gate oxide and below contacts. Using the technology characterization
tool, a number (5–15) of depths below the CMOS surface is fixed. The
resistivities of the different region cross-sections between the discretized
depths are then calculated from the resistivity (or doping concentration)
profiles. The specific junction capacitance at the bottom of n-wells is deduced
as well. The results are stored in a technology description file.
Adaptation of layout and schematics We need to move away from the single-
node substrate model. New nodes are created for the backside substrate
connection of transistors. To this end, in the schematic, the ‘fourth’ MOS
terminals are disconnected from the ground or supply net and brought outside
as pins. In the layout, corresponding labels for the new nodes are placed on
p-bulk and n-well node-labeling layers that are specifically created for this
purpose. In order to logically isolate different nodes in the p-bulk or n-wells,
a special well separation layer is declared. The outlines of shapes in this layer
then represent the boundaries between the nodes.

Figure 5.6: Flow diagram for substrate-aware simulation applied on a dummy design
consisting of a single inverter.

For reasons of computational complexity, we do not want substrate modeling
for small or quiet transistors or those that are far from the region of interest.
An obvious first step is limiting the layout area to be analyzed. Additionally,
a new device property layer is declared for labels that are placed on transistor
gates with the text ‘extract=value’. In a later stage, substrate modeling can
be selectively enabled or disabled for transistors depending on this attached
value.
Substrate contacts generally need no special attention: they automatically
transfer nodal connectivity from the metal net to which they connect to the
substrate. Only when a substrate contact is used as a kind of probe (as in our
noise measurement circuit) should the metal track to which the contact connects
be labeled with a new node name instead of the ground node, both in the
schematic and in the layout.
Layout extraction The next step is the generation of an extracted layout
view. This is a layout view in which devices (transistors, resistors, capacitors,
...) are identified and interconnecting shapes on the same layout
layer are merged. Additionally, all device terminals and layout shapes are
associated with node names that are derived from node labels in the layout
and a set of rules for inter-layer connectivity. Devices that carry a label
in the device property layer are associated with the value given for the
extract property.
We have used Cadence Assura layout-versus-schematic (LVS) for the gener-
ation of the extracted view. The extraction script from the design kit was
modified to take the new node-labeling, well separation, and device property
layers into account. Another modification was the inclusion of substrate con-
tacts in the extracted view—they are normally removed from an extracted
view. In this technology, substrate contacts correspond to very dense arrays of
micron-sized squares. In the extracted view, these arrays are replaced by their
envelope.
Substrate view generation The extracted view will now be used to gen-
erate the surface abstract view (SAV). This is a layout view, generated by
SubstrateStorm, where only wells, substrate contacts and device gates are
indicated. Each point within the SAV then corresponds to one of the verti-
cal region cross-sections as defined in the technology characterization stage.
Three new layers need to be created for the SAV: a region layer for n-wells, an
access port layer for contacts and gates, and a special macro layer to indicate
parts of the layout that need not be modeled very accurately. In the access
port layer, all shapes are associated with their node names and port types
(contact or gate). A configuration file is used to indicate layer correspondence
between extracted view and SAV, and to select the devices taking part in the
substrate modeling based on their extract property.
3D RC network extraction The SubstrateStorm extraction tool now gen-
erates a mesh to make a two-dimensional ‘horizontal’ discretization of the
CMOS. In a piece of layout covered by a macro layer shape, this discretiza-
tion always remains coarse and different substrate contacts with the same
node name may be treated as one contact. Combined with the vertical dis-
cretization of the technology description file, a 3D finite element model of
the substrate is constructed. The resistivities and capacitances stored in the
technology description file are used with the finite element model to generate
a three-dimensional RC network as substrate model.
RC network reduction The 3D RC network typically contains far too many
nodes for a feasible circuit-level simulation. Even in the simple case of the
single inverter of figure 5.6, the network already consists of 2085 nodes, 4506
resistors and 338 capacitors. The SubstrateStorm RC reduction tool performs
a reduction of the number of internal nodes based on a pole analysis of the 3D
RC network. The reduction technique [Kerns and Yang, 1997] retains all poles
from DC to a specified upper frequency limit. For the simple inverter, the
network reduces to 69 nodes, 64 resistors and 246 capacitors at a 5-GHz setting
(the simplified network shown in figure 5.6 is reduced even further). The reduced
network is written out as a SPICE netlist.
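The principle of removing internal nodes while preserving the behavior seen at the
remaining nodes can be illustrated with a much simpler technique than the pole analysis
used by the reduction tool. The sketch below applies node elimination (Kron reduction) to
a purely resistive network at DC; it ignores capacitors and frequency dependence
altogether, and the helper names laplacian and kron_reduce as well as the resistor values
are hypothetical.

```python
def laplacian(n_nodes, resistors):
    """Build the nodal conductance matrix from (node_a, node_b, ohms) tuples."""
    g = [[0.0] * n_nodes for _ in range(n_nodes)]
    for a, b, ohms in resistors:
        cond = 1.0 / ohms
        g[a][a] += cond
        g[b][b] += cond
        g[a][b] -= cond
        g[b][a] -= cond
    return g


def kron_reduce(g, keep):
    """Eliminate every node not listed in `keep` from conductance matrix g."""
    g = [row[:] for row in g]          # work on a copy
    active = list(range(len(g)))
    for k in [i for i in range(len(g)) if i not in keep]:
        active.remove(k)
        # Schur complement step: fold node k into the remaining nodes.
        for i in active:
            for j in active:
                g[i][j] -= g[i][k] * g[k][j] / g[k][k]
    return [[g[i][j] for j in keep] for i in keep]


# Chain: port 0 --100 ohm-- node 1 --200 ohm-- node 2 --300 ohm-- port 3.
full = laplacian(4, [(0, 1, 100.0), (1, 2, 200.0), (2, 3, 300.0)])
reduced = kron_reduce(full, keep=[0, 3])
# The two ports are now joined by a single equivalent conductance.
equivalent_ohms = -1.0 / reduced[0][1]
```

In this example the two internal nodes disappear and the ports see the series resistance
of the chain. A reduction that also has to preserve poles up to a frequency limit, as
described above, requires considerably more machinery.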
Joining schematics and substrate model In a top-level schematic, a substrate-
enhanced symbol for the adapted design schematic is coupled with a symbol
representing the substrate model. In the simulation environment, the reduced
RC network model needs to be included by the main simulation netlist. At
this stage, a substrate-aware circuit-level simulation can be performed.
5.6 Experimental Results
5.6.1 Measurement setup
Figure 5.7 shows an overview of our measurement setup. The 0.18–µm CMOS
IO project receiver IC (containing twelve optical receivers) is augmented with
a noise generation and measurement circuit at the top and bottom. Two
delay-locked clock inputs can be separately configured to drive either noise
generation or measurement circuit. Pulses were generated by an Agilent
81134A generator exhibiting very low jitter (1.5 ps typical), and brought on-
chip via differential signaling. The repetition rate of the measurements is of
the order of 30MHz and essentially depends on the speed of the digital control
circuitry (the latch comparator is definitely not the speed-limiting factor and
could be used at frequencies well over 500 MHz). Acquisition times are of the
order of 30 s for an entire percentile curve or a fixed-voltage cut. Note that
each point results from over 10⁶ comparisons, which could safely be reduced.
A full characterization of m (t, v0) takes several hours at this level of precision.
A dominant factor in the total measurement time is the communication and
settling time of the pulse generator when the delay is changed.

Figure 5.7: The embedding of the noise generation and measurement circuits into the
receiver circuit

Figure 5.8 shows the noise generation circuit in detail. It has a matrix-based
layout with 8 rows of 32 noise cells. The input clock of the noise generation
circuit propagates from row to row with an adjustable delay and drives all cells
within a row simultaneously. The delay is produced by current-starved invert-
ers, with separate control voltages for rising and falling transitions. A noise
cell consists of 30 digital inverters, each driving a capacitance to the substrate
of 15 fF. Every cell can be individually programmed to switch its inverters
with the clock, with the inverted clock, or not at all. This approach yields
a highly customizable noise source in terms of noise location, magnitude,
injected current direction and timing.
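The numbers above allow a rough estimate of how much load capacitance the noise source
switches. The following Python sketch does this arithmetic; it counts only the explicit
15 fF loads (parasitic capacitances are ignored), and the 1.8 V supply voltage is an
assumption for a 0.18 µm process that is not stated in the text.

```python
ROWS = 8
CELLS_PER_ROW = 32
INVERTERS_PER_CELL = 30
LOAD_PER_INVERTER_F = 15e-15   # 15 fF to the substrate per inverter


def switched_capacitance(n_cells):
    """Total explicit load capacitance (in farad) switched by n_cells noise cells."""
    return n_cells * INVERTERS_PER_CELL * LOAD_PER_INVERTER_F


def switched_charge(n_cells, vdd=1.8):
    """Charge (in coulomb) moved per transition; vdd = 1.8 V is an assumed value."""
    return switched_capacitance(n_cells) * vdd


cap_cell = switched_capacitance(1)                       # one noise cell
cap_row = switched_capacitance(CELLS_PER_ROW)            # one row firing at once
cap_matrix = switched_capacitance(ROWS * CELLS_PER_ROW)  # the full matrix
charge_cell = switched_charge(1)
```

Under these assumptions a single cell switches 450 fF, a full row 14.4 pF and the whole
matrix about 115 pF, which illustrates the range over which the noise magnitude can be
programmed.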
The connection of our substrate noise measurement circuit is shown in figure
5.9. The very low substrate voltage levels prohibit direct gate drive of n-
channel MOSFETs, as they would operate below threshold. Either a slower
p-channel input device is used [Nagata et al., 2000], or a capacitive coupling
with an n-channel device gate, biased above threshold, is applied [Donnay and
Gielen, 2003]. Following the latter approach, we use a simple RC high-pass
circuit (R1 = 158kΩ poly resistor, C1 = 20pF coupling capacitor) to couple
a substrate ‘probe’ contact to one latch comparator input. The lower -3-dB
cut-off frequency of 50 kHz is far below that of any substrate phenomena. The
coupling bandwidth with the substrate source depends on the gate capacitance
of transistor T2 and the equivalent resistivity of the substrate. Transistors
T2 and T3 were sized to minimize the sampling time. A -3-dB coupling
bandwidth of approximately 15 GHz was observed through simulation, and
offers a good compromise with the 25-GHz intrinsic measurement bandwidth
of the latch comparator.

Figure 5.8: The noise generation circuit

Figure 5.9: Hooking up the latch comparator into the measurement circuit

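The 50 kHz lower cut-off frequency quoted for the R1-C1 coupling network follows from the
first-order high-pass relation f = 1/(2πRC). The short Python check below assumes an ideal
first-order RC section, i.e. it neglects any loading by the comparator input; the function
name is hypothetical.

```python
import math


def highpass_cutoff_hz(r_ohm, c_farad):
    """-3-dB cut-off frequency of an ideal first-order RC high-pass section."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


R1_OHM = 158e3      # poly resistor, value from the text
C1_FARAD = 20e-12   # coupling capacitor, value from the text
f_cutoff = highpass_cutoff_hz(R1_OHM, C1_FARAD)
```

The result is slightly above 50 kHz, in line with the figure given in the text.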
For the adjustable comparison voltage, a similar RC circuit is used, which
serves as a low-pass filter of an externally applied voltage against a local refer-
ence ground net. Strongly connected on-chip metal can be considered nearly
equipotential in the frequency range of ringing (50–500MHz). The on-chip
digital metal ground is a good ground net reference for our measurements, as
this choice yields nearly equal ground-bounce distortion at both latch inputs.
This way, it is guaranteed that only the fast substrate effects are measured
without being blurred by the effect of ringing.
5.6.2 Substrate noise evaluation
Figure 5.10(a) illustrates the kind of results obtained using the system in
low-noise conditions, which allow the use of the 50th percentile (the median) for
waveform estimation. The labels identifying the pulses correspond to the locations of
single noise cells in column 32 of the array as shown in figure 5.8. Cells were
fired individually to assess the impact of their distance to the measurement
circuit. This dependency is illustrated in figure 5.11, where it is clearly
observed that increased distance leads to less noise, as could be expected (cells
E32-H32). Additionally, a shielding effect can be observed. This is due to the
fact that the grounding capacitor is surrounded by a guard ring. This ring is
in between some of the noise cells and the substrate sensing capacitor (cells
A32-D32; see figure 5.8).
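Why the median is a usable waveform estimate in low-noise conditions can be illustrated
with a small model. The Python sketch below assumes that the comparator statistic at one
time instant is the fraction of comparisons in which the noisy signal exceeds the
comparison voltage v0, and that the only disturbance is zero-mean Gaussian voltage noise
(timing jitter is not modeled). The sign convention, the function names and the 12 mV
signal level are assumptions; σ = 1 mV is the value mentioned earlier in this chapter.

```python
import math


def fraction_above(v_signal, v0, sigma):
    """Expected fraction of comparisons in which signal + noise exceeds v0."""
    return 0.5 * math.erfc((v0 - v_signal) / (sigma * math.sqrt(2.0)))


def percentile_voltage(measure, p, lo, hi, tol=1e-9):
    """Find v0 in [lo, hi] below which a fraction p of the samples falls.

    `measure(v0)` must return the fraction of samples above v0 and must
    decrease monotonically with v0; the search uses plain bisection.
    """
    target = 1.0 - p
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if measure(mid) > target:
            lo = mid      # too many samples above mid: percentile lies higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


V_TRUE = 12e-3   # hypothetical substrate voltage at one time instant (V)
SIGMA = 1e-3     # additive noise standard deviation (V)


def measure(v0):
    return fraction_above(V_TRUE, v0, SIGMA)


median = percentile_voltage(measure, 0.5, -25e-3, 25e-3)
p90 = percentile_voltage(measure, 0.9, -25e-3, 25e-3)
```

For symmetric additive noise the 50th percentile coincides with the true voltage, while
other percentiles are offset by a multiple of σ. Once timing jitter becomes significant
this no longer holds, which is why a dedicated reconstruction technique is needed in that
case.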
By simulation, predictions for the measured waveforms were obtained (figure
5.10(b)). The predicted substrate voltage waveforms are compared to the
waveforms actually observed. The peak values differ by less
than 10%, and observed pulse waveforms tend to be slightly wider than the
predicted ones. Overall, this level of correspondence is deemed very good in
view of a number of simplifications required to render the simulation
computationally feasible. The most intensive task is the required RC reduction step,
which took several hours on a 64-bit machine with 4 GiB of memory. Figure
5.12 illustrates the simplifications made. They include limiting the layout area
to be analyzed, joining physically separate but very closely related substrate
contacts, choosing not to model the substrate interaction of quiet or small
transistors, and indicating complex substrate contact patterns that may be
simplified to one large contact (by using the macro layer). We found that there
was a limit to these reductions beyond which simulation accuracy significantly
degrades; for instance, considering all substrate connections within one noise
cell as equipotential would have been too much of a simplification. Our noise
measurement thus helped to calibrate our substrate model in order
to achieve reliable results from the substrate analysis software.

Figure 5.10: A comparison of simulated and actually measured direct substrate noise
waveforms: (a) measurement, (b) simulation. Both panels show voltage (mV) versus
time (ns) for noise cells A to H. Note the very small width of the substrate noise pulses.

Figure 5.11: Measured substrate noise peak voltages (mV) caused by switching noise cells
(noise matrix 1, column 32) in terms of their distance (µm) to the substrate sensing
capacitor. Cells E32 to H32 show less noise with increased distance; cells A32 to D32 are
shielded or partly shielded.

Figure 5.12: An excerpt of the IC layout (see figure 5.7) illustrating the differences
between the full layout and the SAV being used for substrate coupling analysis. In
the SAV, black boxes represent substrate contacts, rectangles are n-wells and some
hatched areas (on the macro layer) indicate complex substrate contact patterns that may
be simplified to one large contact.

As outlined previously, the measurement circuits were embedded alongside
photodiode receiver circuits. We expected that the direct impact of the noise
cells on the receiver circuits would be small, due to a thorough differential
design and a p+ guard ring absorbing most direct substrate noise. Nevertheless,
we observed receiver signal degradation that depended on the precise
location of the enabled noise cells. We have identified the innermost noise
cells, in column 1 of both noise matrices, as the cause, regardless of the
location of the observed optical channel. This eliminated direct coupling of
noise into the receiver circuits through the substrate as an explanation. Figures
5.13–5.14 show the reason: a very sensitive biasing node (to bias the photodi-
odes; see figure 2.21 on page 53) was brought on-chip using ordinary standard
cell library pads. The power ring was deliberately interrupted to eliminate
noise coupling through supply and ground nets. This, however, resulted in
floating substrate contacts in each pad’s protective structures. It provided an
unhindered capacitive coupling between the substrate and the pads through
the protective diodes. Nearby noise cells could therefore directly inject current
into the pads and hence into the receiver circuit inputs. The effects were
observed at the output of the receiver circuits. Simulations with our calibrated
substrate model have confirmed this coupling mechanism. The observed
sensitivity can be eliminated either by removing the protective diodes or by
adding an extra ground-connected guard ring. In summary, here, the noise
cells have provided a good diagnostic for a very real direct substrate noise
coupling threat that otherwise would have remained unnoticed.
Apart from the path through the unprotected pads, no direct substrate noise
coupling could be established. This is good news for OE-VLSI integration: it
confirms the effectiveness of the guard ring approach to absorb local substrate
disruptions, thus limiting the problem of substrate noise coupling to global
supply and ground bounce, which has to be taken into account anyway (yet
admittedly occurring with higher amplitude in an OE-VLSI setup).
5.6.3 Pulse reconstruction
An expanded view of positive pulse F of figure 5.10 is shown in figure 5.15(a).
As observed from the figure, the time resolution is excellent and allows
pulses as narrow as 200 ps to be measured accurately. This measurement was
done with a pulse generator with extremely low jitter. However, even in the
presence of a significant amount of jitter, our pulse reconstruction technique
of section 5.4 can accurately reconstruct major pulse characteristics with few
measurements, or even the full pulse waveform when m (t, v0) is measured
over the full range of t and v0. To demonstrate the pulse reconstruction
technique, we have deliberately added a sine-modulated jitter term in the
measurement clock (pk-pk amplitude of 100 ps) and redone the measurement.
Figure 5.13: Noise cells are directly coupled with the substrate contacts of otherwise
unconnected protection diodes in the pad ring. Injected substrate noise was coupled
through the depletion capacitance of these diodes to a sensitive analog biasing net, the
oscillations of which could be observed (the high-frequency ringing of figure 5.1 is a
measured example). The shading of the noise cells represents their measured relative
individual contribution to this disturbance. The location of this piece of layout on the
IC is indicated on figure 5.14.

Figure 5.14: Overview of the peak voltage disturbance measured at the output of
receiver channel RX12 (non-illuminated) as caused by individual noise cells firing.
The coloring of the noise cells represents their relative individual contribution to this
disturbance (maximal output signal distortion, from 0 mV to 12.8 mV). The hatched area
is the piece of layout represented in more detail in
figure 5.13. The boxes around the receivers are outer guard rings. It is clear that only
the cells close to the indicated sensitive bias voltage bond pads contribute to the result.

Figure 5.15: The accuracy of pulse width estimation: (a) jitter-free reference
measurement, (b) jitter-polluted measurement, (c) detail of (b). All panels show
voltage (mV) versus time (ps). The estimated width at half-maximum
was obtained without additive-noise correction; the location of the peak was
determined by parabolic fit. Percentiles for raw measurement data are shown in thin
solid line. Note that the median (50% curve) is not a good estimate of the true pulse.

The resulting percentile curves are shown in figure 5.15(b) together with the
result of the waveform reconstruction from equations 5.21 and 5.22. As the
jitter-free measurement intrinsically has a (small) amount of additive noise,
we are now faced with extracting the true pulse shape from the measurements
with both noise components present. The figure shows how well the estimates
at half the pulse height vm fit the true value, which demonstrates that the error
term of equation 5.29 introduced by the additive noise was negligible. It is
also clearly shown that when trying to estimate the pulse peak this way, severe
errors are made. As shown in figure 5.15(b), naive application of equations
5.21 and 5.22 leads to an onion-shaped estimate, extending several mV above
the true value (a small multiple of σ). Furthermore, near the top, the width is
systematically underestimated as a result of curvature. To further improve the
accuracy in this jittered case, the parabolic fitting procedure of appendix D.4
has been applied. The result is shown in figure 5.15(c) and agrees very well with the
jitter-free measurement. This way, accurate measurements can be performed
even in situations with relatively high jitter.
5.7 Conclusion
We have presented an experimental environment to characterize the sensitivity
of embedded analog circuits (such as photodiode receiver circuits) to
digitally generated, very localized substrate noise. The method is based on
equivalent-time sampling of periodic noise signals, using a simple latch
comparator circuit. This latch circuit was demonstrated to have a large
measurement bandwidth. This way, the technique allows accurate on-chip
measurement of small but fast voltage changes without the need for high-bandwidth
analog output. On our 0.18-µm CMOS test chip, we have demonstrated
that our system can accurately trace the waveform of pulses as narrow as 200 ps.
Hence, it is capable of measuring both the direct and indirect substrate noise
signals. The substrate was modeled for circuit-level simulation using
Cadence’s Substrate Noise Analyst, and good correspondence between measured
and simulated substrate voltage waveforms was observed. Our measurement
circuit proved to be a suitable means of finding the balance, with respect to the
allowable amount of substrate model simplification, between reliable results
and feasible computation times.
Additionally, a new method, not requiring any deconvolution operations, was
presented to extract accurate waveform properties from the raw measurement
data, even in the presence of significant timing jitter.
On the 0.18-µm receiver IC of the IO project, apart from the path through some
unguarded pads, no direct substrate noise coupling could be established.
This is good news for OE-VLSI integration: it confirms the effectiveness of
guard ring protection to absorb local substrate disruptions, thus limiting the
problem of substrate noise coupling to global supply and ground bounce,
which has to be taken into account anyway (although it admittedly occurs with
higher amplitude in an OE-VLSI setup).
Chapter 6
Overall conclusions and
perspectives
Short-range parallel optical interconnect provides a solution to current bandwidth
density problems concerning electrical interconnect at the (inter-)PCB
level. We have focused on the optoelectronic very large scale integration (OE-VLSI)
approach, in which optical access is provided right at the surface of CMOS
ICs. Two-dimensional arrays of lasers and photodetectors are flip-chip bonded
to the CMOS and physically interconnected using dense arrays of optical
waveguides. We have discussed the constituents of such a system, the actual
OE-VLSI realization in the Interconnect by Optics project, and references to
other work.
The direct integration of optical-electrical conversion provisions on an IC
bypasses the electrical PCB-to-IC interconnect, which can be very sophisticated
when a high bandwidth density is envisaged. However, the intricacies of this
chip access level are much better known and solution methodologies are far
more established in the electrical case than in the OE-VLSI case.
In this dissertation, we have focused on a number of concerns of a digital
system designer considering an OE-VLSI approach.
6.1 Design automation integration
The design of OE-VLSI systems consists of several steps, each needing some
form of electronic design automation (EDA) support. We have presented
a breakdown of fundamental design automation support—design creation,
simulation, extraction of properties and design space exploration—the method-
ologies involved and references to relevant realizations.
Circuit-level simulation models for the most important link components have
been implemented in the mixed-signal language Verilog-AMS and integrated
into a simulation framework (in our case, Cadence Virtuoso DFII).
The different simulation models can be chained together to simulate optical
interconnect. We have also discussed the intricacies of integrating
multidisciplinary (electrical, optical, thermal) and sometimes badly conditioned
differential equations (e.g., laser rate equations) into a simulation system that
was designed primarily for electrical circuits.
6.2 Statistical modeling of uniformity
We have developed stochastic models for the inter-channel uniformity of
different subsystems of an OE-VLSI setup, in order to examine the unifor-
mity between complete channels. This modeling effort addresses laser drive
currents, VCSEL and photodiode process variations, VCSEL-fiber coupling
efficiencies caused by fiber misalignment, coupling efficiencies between a
VCSEL array and a multi-fiber connector, receiver circuit delays, and link bit
error ratios.
For quantitative results, we have examined the converter boards of the IO
project, a parallel optical inter-chip interconnect realization with 64 channels
at 1.25Gbps/channel. The most important signal amplitude nonuniformity
contribution can be attributed to variations of the fiber-coupled power at the
transmitting side, mainly due to VCSEL process variations, and to a lesser
extent caused by global multi-fiber connector misalignment. The alignment
at the receiving side is not nearly as critical given the standard fibers that
have been considered. Overall, the signal before the limiting amplifier of
any interconnect channel will suffer a combined process variation with a
standard deviation σall of 0.44 dB.
The principal application of the combined stochastic models was to make a
statement on the feasibility of using one dedicated clock channel as a clocking
or logic threshold basis for all other channels. When using source-synchronous
signaling—a common clocking basis for all channels—the substantial per-
channel clock synchronization circuitry reduces to one instance. If a common
logic threshold basis can be assumed, the requirement of dc-balanced data
coding is removed.
It turns out that, for the characteristics extracted from the IO project setup,
for any decent number of channels the usage of a common logic threshold is
infeasible. The bit error ratio is just too dependent on an accurate threshold.
Source-synchronous signaling should be feasible if the clock distribution skew
and the signal skew between channels are small enough, and the duration
of the acceptable sampling interval in the signal eye is sufficiently long. The
dependence of the channel delay on the optical modulation amplitude is
apparent (17.5 ps/dB). However, this turns out to be only a minor contributor
to the total inter-channel skew (σ ≈ 60ps) due to a sufficiently uniform optical
modulation amplitude. Using source-synchronous signaling, the yield of an
array-wide BER < 10⁻¹² is estimated at 56%. When the synchronous bitrate
is reduced to about 0.8 GHz, this yield figure increases to 99.9%. Even though
our setup was not designed for source-synchronous operation, our results
indicate that efficient source-synchronous parallel optical interconnect is well
within reach of well-crafted systems optimized for this mode of operation.
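The statement that amplitude-dependent delay is only a minor contributor to the skew can
be made plausible with a back-of-the-envelope calculation. The Python sketch below
combines the 17.5 ps/dB delay sensitivity with the 0.44 dB amplitude spread and compares
the result with the total skew of about 60 ps. Two assumptions are made that the text does
not state: that the 0.44 dB spread is the relevant spread of the optical modulation
amplitude, and that the skew contributions are independent so that their variances add.

```python
import math

DELAY_SENSITIVITY_PS_PER_DB = 17.5   # from the text
AMPLITUDE_SIGMA_DB = 0.44            # assumed to apply to the modulation amplitude
TOTAL_SKEW_SIGMA_PS = 60.0           # from the text (approximate)

# Standard deviation of the delay caused by amplitude variation alone.
amplitude_skew_ps = DELAY_SENSITIVITY_PS_PER_DB * AMPLITUDE_SIGMA_DB

# Share of the total skew variance explained by this mechanism,
# assuming independent contributions that add in quadrature.
variance_share = (amplitude_skew_ps / TOTAL_SKEW_SIGMA_PS) ** 2

# Skew that would remain if the amplitude-dependent part were removed.
remaining_skew_ps = math.sqrt(TOTAL_SKEW_SIGMA_PS ** 2 - amplitude_skew_ps ** 2)
```

Under these assumptions the amplitude-dependent delay amounts to about 7.7 ps, which is
less than 2% of the total skew variance; removing it would lower the total skew by well
under 1 ps.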
6.3 Substrate noise
We have researched the possible problem of noise coupling into sensitive
photodiode receiver circuits embedded in a hostile digital environment. We
have focused on direct noise coupling—attacking only part of a circuit—rather
than supply and ground bounce. To this end, dedicated noise generation and
measurement circuits have been implemented and characterized.
On the 0.18-µm CMOS receiver IC of the IO project, apart from the path through
some unguarded pads, no direct substrate noise coupling path could
be observed. This is good news for OE-VLSI integration: it confirms the
effectiveness of guard ring protection to absorb local substrate disruptions,
thus limiting the problem of substrate noise coupling to global supply and
ground bounce, which has to be taken into account anyway (yet admittedly
occurring with higher amplitude in an OE-VLSI setup).
6.4 Future work and outlook
The statistical modeling and characterization effort presented in this work has
enabled us to perform a number of interesting system-level analyses.
Nevertheless, the characterization has necessarily been limited to a sample
of only one OE-VLSI realization. It would be interesting if a wide choice of
possible link components could be properly statistically characterized, so as
to allow the evaluation of different optical interconnect constitutions. The
modeling of the dynamic uniformity at the transmitter side deserves more
attention; this was not possible with the IO project hardware. Furthermore, one
should get closer to the bottom of the underlying fundamental mechanisms
of variability (in our case especially w.r.t. VCSELs). If the connection between
process variations of individual manufacturing steps and the resulting device
behavior were clear, the distribution of observable but
amalgamated quantities could be predicted, instead of
having to rely on opaque measurements of these distributions.
Regarding interconnect modeling in the case of smaller-featured CMOS tech-
nologies and correspondingly higher signaling rates, dynamic effects (and
their uniformity) at different locations in the link can become significant and
should be taken into account. The relaxation oscillation frequency of VCSELs
and the impact-related transit time in a larger photodiode come to mind; when
less ideal optical waveguides (than graded-index fibers) are considered—such
as PCB-embedded waveguides—dispersion can become significant as well.
The modeling and design automation support for PCB-embedded waveguides
remains an exercise for when the technology has become more settled and
mature.
A final word on the commercial introduction of inter-chip optical interconnect,
as looked upon from a personal perspective. At present, the physical assem-
bly methods of all experimentally validated inter-chip optical interconnect
solutions are just too sophisticated or demanding w.r.t. mechanical accuracy
to be commercially viable when compared to electrical interconnect solutions.
In the past few years there has been a trend among research teams to focus on
even shorter link lengths, with major research efforts concentrating on on-chip
optical interconnect. Although there certainly are benefits associated with
on-chip optical interconnect, the need for a more practical high-density optical
transition between an IC and the outside world remains. Wishful thinking
suggests a ‘perfect solution’ in which optical waveguides are seamlessly
integrated into a general printed circuit board in multiple layers (with
inter-layer vias), and in which surface-mountable, hermetically sealed
OE-VLSI chip packages adhere to the same mounting alignment requirements
as purely electrical packages. Although several partial solutions exist (we
refer to Schares et al. [2006] for an overview), optical inter-chip interconnect
will not emerge in practice before a practical all-embracing solution has been
validated.
On the demand side, the drive towards higher bandwidth densities is currently
very much present, with the move towards an ever-increasing number of
processor cores even in common personal computers. The interconnect
bandwidth between different processors is the bottleneck for the execution
speed of strongly connected distributed algorithms and will need to be
addressed in due time. In our opinion, highly parallel optical interconnect
provides an escape route from the bandwidth density problems of electrical
interconnect. Looking forward to the materialization of huge interconnect
demands in the coming years, we hope that the contributions of this thesis
will carry through to the interconnect solutions of the future.
References
AnandTech. Quad Core Intel Xeon 53xx Clovertown [online]. (2006). Available from:
http://www.anandtech.com/IT/showdoc.aspx?i=2897.
Ankiewicz, A. and Pask, C. (1977). Geometric optics approach to light acceptance and
propagation in graded index fibres. Optical and Quantum Electronics, 9(2):87–109.
doi:10.1007/BF00619890.
Annen, R. (2002). CMOS VCSEL Drivers for Optical Interchip Communication. PhD thesis,
Swiss Federal Institute of Technology Zürich, Dissertation No. 14081. Available from:
http://e-collection.ethbib.ethz.ch/show?type=diss&nr=14081.
Annen, R. and Melchior, H. (2002). Fully integrated 0.25 µm CMOS VCSEL driver with
current peaking. Electronics Letters, 38(4):174–175. doi:10.1049/el:20020123.
ANSI/TIA/EIA-644-A: Electrical characteristics of low voltage differential signaling
(LVDS) interface circuits [online]. (2001). Available from: http://www.tiaonline.
org/standards/catalog/details.cfm?document_id=1747.
Arrue, J., Zubia, J., Durana, G., and Mateo, J. (2001). Parameters affecting bending
losses in graded-index polymer optical fibers. IEEE Journal of Selected Topics in
Quantum Electronics, 7(5):836–844. doi:10.1109/2944.979345.
Badaroglu, M., Van der Plas, G., Wambacq, P., Balasubramanian, L., Tiri, K., Ver-
bauwhede, I., Donnay, S., Gielen, G. G. E., and De Man, H. J. (2004). Dig-
ital circuit capacitance and switching analysis for ground bounce in ICs with
a high-ohmic substrate. IEEE Journal of Solid-State Circuits, 39(7):1119–1130.
doi:10.1109/JSSC.2004.829393.
Badaroglu, M., Wambacq, P., Van der Plas, G., Donnay, S., Gielen, G. G. E., and De Man,
H. J. (2006). Evolution of substrate noise generation mechanisms with CMOS
technology scaling. IEEE Transactions on Circuits and Systems—Part I: Fundamental
Theory and Applications, 53(2):296–305. doi:10.1109/TCSI.2005.856049.
Barrett, R. M. and Barnes, M. H. (1951). Microwave printed circuits. Radio and T.V.
News, 46:16.
Bastiaans, M. J. and Alieva, T. (2006). Propagation law for Hermite- and Laguerre-
Gaussian beams in first-order optical systems. In Sheng, Y., Zhuang, S., and Zhang,
Y., editors, ICO-20: Optical Information Processing, volume 6027 of Proceedings of SPIE.
article 602711, 6p., ISBN: 978-0-8194-6058-5.
Baukens, V. (2001). Scalable micro-optical modules for short distance photonic-VLSI intercon-
nections. PhD thesis, Vrije Universiteit Brussel, Belgium.
Baukens, V., Verschaffelt, G., Tuteleers, P., Vynck, P., Ottevaere, H., Kufner, M., Kufner,
S., Veretennicoff, I., Bockstaele, R., Van Hove, A., Dhoedt, B., Baets, R., and Thienpont,
H. (1999). Performances of optical multi-chip-module interconnects: comparing
guided-wave and free-space pathways. Journal of the European Optical Society Part A,
1(2):255–261. doi:10.1088/1464-4258/1/2/027.
Beaudoin, M., DeVries, A. J. G., Johnson, S. R., Laman, H., and Tiedje, T. (1997). Optical
absorption edge of semi-insulating GaAs and InP at high temperatures. Applied
Physics Letters, 70(26):3540–3542. doi:10.1063/1.119226.
Belden Cable: Catalog—coaxial cables—50 Ohm transmission cable—electrical char-
acteristics of 9913, 9913F7 and 9914 [online]. (2006). Available from: http://www.
belden.com/pdfs/03Belden_Master_Catalog/06Coaxial_Cables/06.67_71.pdf.
Bell, A. G. (1881). Telephone circuit. U.S. Patent 244 426. Available from: http:
//www.pat2pdf.org/pat2pdf/foo.pl?number=244426.
Benner, A. F., Ignatowski, M., Kash, J. A., Kuchta, D. M., and Ritter, M. B. (2005).
Exploitation of optical interconnects in future server architectures. IBM Journal of
Research and Development, 49(4/5):755–775. doi:10.1147/rd.494.0755.
Berger, C., Kossel, M. A., Menolfi, C., Morf, T., Toifl, T., and Schmatz, M. L. (2003).
High-density optical interconnects within large-scale systems. In Thienpont, H. and
Danckaert, J., editors, VCSELs and Optical Interconnects, volume 4942 of Proceedings of
SPIE, pages 222–235. ISBN: 978-0-8194-4737-1. doi:10.1117/12.468494.
Bewtra, N., Suda, D. A., Tan, G. L., Chatenoud, F., and Xu, J. M. (1995). Modeling of
quantum-well lasers with electro-opto-thermal interaction. IEEE Journal of Selected
Topics in Quantum Electronics, 1(2):331–340. doi:10.1109/2944.401212.
Bhatti, J. (2006). Analyse en ontwerp van aanstuur- en ontvangstcircuits voor geïnte-
greerde optische verbindingen (Analysis and design of driver and receiver circuits
for integrated optical interconnect). Master’s thesis, Ghent University. Avail-
able from: http://aleph.ugent.be/F/?func=direct&doc_library=RUG01&doc_
number=001030676.
Bienstman, P., Baets, R., Vukusic, J., Larsson, A., Noble, M. J., Brunner, M., Gulden, K.,
Debernardi, P., Fratta, L., Bava, G. P., Wenzel, H., Klein, B., Conradi, O., Pregla, R.,
Riyopoulos, S. A., Seurin, J.-F. P., and Chuang, S. L. (2001). Comparison of optical
VCSEL models on the simulation of oxide-confined devices. IEEE Journal of Quantum
Electronics, 37(12):1618–1631. doi:10.1109/3.970909.
Blackshear, E. D., Cases, M., Klink, E., Engle, S. R., Malfatt, R. S., de Araujo, D. N.,
Oggioni, S., LaCroix, L. D., Wakil, J. A., Hougham, G. G., Pham, N. H., and Russell,
D. J. (2005). The evolution of build-up package technology and its design challenges.
IBM Journal of Research and Development, 49(4/5):641–661. doi:10.1147/rd.494.0641.
Bockstaele, R. (2001). Resonant Cavity Light Emitting Diode Based Parallel Optical
Interconnects. PhD thesis, Ghent University, Belgium. Available from: http:
//photonics.intec.ugent.be/publications/PhD.asp?ID=146.
Bockstaele, R., Coosemans, T., Sys, C., Vanwassenhove, L., Van Hove, A., Dhoedt, B.,
Moerman, I., Van Daele, P., Baets, R. G., Annen, R., Melchior, H., Hall, J., Heremans,
P. L., Brunfaut, M., and Van Campenhout, J. (1999). Realization and characterization
of 8×8 resonant cavity LED arrays mounted onto CMOS drivers for POF-based
interchip interconnections. IEEE Journal of Selected Topics in Quantum Electronics,
5(2):224–235. doi:10.1109/2944.778294.
Bockstaele, R., De Wilde, M., Meeus, W., Rits, O., Lambrecht, H., Van Campenhout,
J., De Baets, J., Van Daele, P., van den Berg, E., Klemenc, M., Eitel, S., Annen, R.,
Van Koetsem, J., Widawski, G., Goudeau, J., Bareel, B., Le Moine, P., Fries, R.,
Straub, P., and Baets, R. (2004a). A parallel optical interconnect link with on-chip
optical access. In Thienpont, H., Choquette, K. D., and Taghizadeh, M. R., editors,
Micro-Optics, VCSELs, and Photonic Interconnects, volume 5453 of Proceedings of SPIE,
pages 124–133. ISBN: 978-0-8194-5376-1. doi:10.1117/12.545490.
Bockstaele, R., De Wilde, M., Meeus, W., Sergeant, H., Rits, O., Van Campenhout,
J., De Baets, J., Van Daele, P., Dorgeuille, F., Eitel, S., Klemenc, M., Annen, R.,
Van Koetsem, J., Goudeau, J., Bareel, B., Fries, R., Straub, P., Marion, F., Routin,
J., and Baets, R. (2004b). Chip-to-chip parallel optical interconnects over optical
backpanels based on arrays of multimode waveguides. In Proceedings of the 9th
Annual Symposium of the IEEE Lasers and Electro-Optics Society (LEOS) Benelux Chapter,
pages 61–64. ISBN: 978-90-76546-06-3. Available from: http://leosbenelux.org/
symp04/s04p061.pdf.
Bode, H. W. (1937). Attenuation equalizer. U.S. Patent 2 096 027. Available from:
http://www.pat2pdf.org/pat2pdf/foo.pl?number=2096027.
Box, G. E. P. and Muller, M. E. (1958). A note on the generation of random normal
deviates. The Annals of Mathematical Statistics, 29(2):610–611.
Boyd, G. D., Miller, D. A. B., Chemla, D. S., McCall, S. L., Gossard, A. C., and English,
J. H. (1987). Multiple quantum well reflection modulator. Applied Physics Letters,
50(17):1119–1121. doi:10.1063/1.97935.
Brunfaut, M., Meeus, W., Dambre, J., Bockstaele, R., Van Campenhout, J., Melchior,
H., Hall, J., Neyer, A., Heremans, P., Van Koetsem, J., King, R., Thienpont, H.,
Vanwassenhove, L., and Baets, R. (2001). Demonstrating POF based optoelectronic
interconnect in a multi-FPGA prototype system. In Proceedings of the 27th European
Conference on Optical Communication (ECOC 2001), volume 3, pages 298–299. ISBN:
978-0-7803-6705-0. doi:10.1109/ECOC.2001.989636.
Bui Viet, K. (2004). Opportunities for reconfigurable optical interconnects in parallel computer
architectures. PhD thesis, Vrije Universiteit Brussel, Belgium.
Cadence Substrate Noise Analyst [online]. (2004). Available from: http://www.
cadence.com/.
Cadence Virtuoso Custom Design Platform [online]. (introduced in 1987). Available
from: http://www.cadence.com/products/custom_ic/.
Casper, B., Martin, A., Jaussi, J. E., Kennedy, J., and Mooney, R. (2003). An 8-Gb/s
simultaneous bidirectional link with on-die waveform capture. IEEE Journal of
Solid-State Circuits, 38(12):2111–2120. doi:10.1109/JSSC.2003.818569.
Cho, H., Kapur, P., and Saraswat, K. C. (2004a). Power comparison between high-speed
electrical and optical interconnects for interchip communication. IEEE/OSA Journal
of Lightwave Technology, 22(9):2021–2033. doi:10.1109/JLT.2004.833531.
Cho, H., Kapur, P., and Saraswat, K. C. (2006). Performance comparison be-
tween vertical-cavity surface-emitting laser and quantum-well modulator for
short-distance optical links. IEEE Photonics Technology Letters, 18(3):520–522.
doi:10.1109/LPT.2005.863986.
Cho, I.-K., Yoon, K. B., Ahn, S. H., Jeong, M. Y., Sung, H.-K., Lee, B. H.,
Heo, Y. U., and Park, H.-H. (2004b). Board-to-board optical interconnection
system using optical slots. IEEE Photonics Technology Letters, 16(7):1754–1756.
doi:10.1109/LPT.2004.828833.
Cho, M. H., Hwang, S. H., Cho, H. S., and Park, H.-H. (2005). High-coupling-
efficiency optical interconnection using a 90°-bent fiber array connector in op-
tical printed circuit boards. IEEE Photonics Technology Letters, 17(3):690–692.
doi:10.1109/LPT.2004.840916.
Cho, T. B. and Gray, P. R. (1995). A 10 b, 20 Msample/s, 35 mW pipeline A/D converter.
IEEE Journal of Solid-State Circuits, 30(3):166–172. doi:10.1109/4.364429.
Coakley, K. J., Wang, C.-M., Hale, P. D., and Clement, T. S. (2003). Adaptive char-
acterization of jitter noise in sampled high-speed signals. IEEE Transactions on
Instrumentation and Measurement, 52(5):1537–1547. doi:10.1109/TIM.2003.817905.
Collet, J. H., Litaize, D., Van Campenhout, J., Jesshope, C., Desmulliez, M., Thienpont,
H., Goodman, J., and Louri, A. (2000). Architectural approach to the role of optics
in monoprocessor and multiprocessor machines. Applied Optics, 39(5):671–682.
Coosemans, T. (2006). Development of a Connector Technology for High-Density Parallel
Interconnects Based on Plastic Optical Fibres. PhD thesis, Ghent University, Belgium.
Available from: http://photonics.intec.ugent.be/publications/phd.asp?ID=
155.
Corning Optical Fiber: technical library [online]. (last checked in 2007). Available from:
http://www.corning.com/opticalfiber/technical_library/.
Dally, W. J. and Poulton, J. (1997). Transmitter equalization for 4–Gbps signaling. IEEE
Micro, 17(1):48–56. doi:10.1109/40.566199.
De Clerck, S. (2004). Karakterisering en modellering van multimode VCSELs voor
simulatie op circuitniveau (Characterisation and modeling of multimode VCSELs for
circuit-level simulation). Master’s thesis, Ghent University. Available from: http://
aleph.ugent.be/F/?func=direct&doc_library=RUG01&doc_number=000820552.
De Maeyer, J., Rombouts, P., and Weyten, L. (2004). A double-sampling
extended-counting ADC. IEEE Journal of Solid-State Circuits, 39(3):411–418.
doi:10.1109/JSSC.2003.822903.
De Wilde, M., Dambre, J., and Stroobandt, D. (2001). A flexible tree-based platform
for design space exploration of hierarchical FPGA architectures. In Second Faculty
of Engineering (FirW) PhD Symposium. Ghent University. Available from: http:
//escher.elis.ugent.be/publ/Edocs/DOC/P101_151.pdf.
De Wilde, M., Meeus, W., Rombouts, P., and Van Campenhout, J. (2006a). A simple
on-chip repetitive sampling setup for the quantification of substrate noise. IEEE
Journal of Solid-State Circuits, 41(5):1062–1072. Available from: http://escher.elis.
ugent.be/publ/Edocs/DOC/P106_052.pdf, doi:10.1109/JSSC.2006.872873.
De Wilde, M., Meeus, W., and Van Campenhout, J. (2006b). Analysis of localized
high-frequency substrate noise. In Proceedings of the CDNLive! Silicon Valley 2006
Conference, volume PNP, San Jose, CA, USA. Available from: http://www.cdnusers.
org/Portals/0/cdnlive/na2006/PNP/PNP_257/257_paper.pdf.
De Wilde, M. and Rits, O. (2003). Design methodology development for guided-wave
optical interconnects. In European Design and Automation Association (EDAA) Ph.D.
Forum at the 2003 Design, Automation and Test in Europe (DATE 2003) Conference,
Munich, Germany. Available from: http://escher.elis.ugent.be/publ/Edocs/
DOC/P103_022.pdf.
De Wilde, M., Rits, O., Baets, R., and Van Campenhout, J. (2007). Synchronous parallel
optical I/O on CMOS: a case study of the uniformity issue. IEEE/OSA Journal of
Lightwave Technology. In review.
De Wilde, M., Rits, O., Bockstaele, R., Baets, R., and Van Campenhout, J. (2003a). Design
methodology development for VCSEL-based guided-wave optical interconnects. In
Proceedings of the 7th IEEE Workshop on Signal Propagation on Interconnects (SPI 2003),
pages 137–138, Siena, Italy. Available from: http://escher.elis.ugent.be/publ/
Edocs/DOC/P103_028.pdf.
De Wilde, M., Rits, O., Bockstaele, R., Van Campenhout, J. M., and Baets, R. G. (2003b).
A circuit-level simulation approach to analyze system-level behavior of VCSEL-
based optical interconnects. In Thienpont, H. and Danckaert, J., editors, VCSELs
and Optical Interconnects, volume 4942 of Proceedings of SPIE, pages 247–257. ISBN:
978-0-8194-4737-1. Available from: http://escher.elis.ugent.be/publ/Edocs/
DOC/P103_021.pdf, doi:10.1117/12.471929.
De Wilde, M., Rits, O., Meeus, W., Lambrecht, H., and Van Campenhout, J. (2004).
Integration of modeling tools for parallel optical interconnects in a standard EDA
design environment. In Workshop on Parallel Optical Interconnects Inside Electronic
Systems at the 2004 Design, Automation and Test in Europe (DATE 2004) Conference,
Paris, France. Available from: http://escher.elis.ugent.be/publ/Edocs/DOC/
P104_072.pdf.
De Wilde, M., Stroobandt, D., and Van Campenhout, J. (2002). AQUASUN: adaptive
window query processing in CAD applications for physical design and verification.
In Proceedings of the 12th ACM Great Lakes symposium on VLSI (GLSVLSI 2002), pages
153–159, New York, USA. ISBN: 978-1-58113-462-9. Available from: http://escher.
elis.ugent.be/publ/Edocs/DOC/P102_007.pdf, doi:10.1145/505306.505339.
Debaes, C. (2003). Optical Intra-Multi-Chip-Module Interconnects. PhD thesis, Vrije
Universiteit Brussel, Belgium.
Debaes, C., Van Erps, J., Vervaeke, M., Volckaerts, B., Ottevaere, H., Gomez,
V., Vynck, P., Desmet, L., Krajewski, R., Ishii, Y., Hermanne, A., and Thien-
pont, H. (2006). Deep proton writing: a rapid prototyping polymer micro-
fabrication tool for micro-optical modules. New Journal of Physics, 8(270):1–18.
doi:10.1088/1367-2630/8/11/270.
Debaes, C., Vervaeke, M., Baukens, V., Ottevaere, H., Vynck, P., Tuteleers, P., Vol-
ckaerts, B., Meeus, W., Brunfaut, M., Van Campenhout, J., Hermanne, A., and
Thienpont, H. (2003). Low-cost microoptical modules for MCM level optical in-
terconnections. IEEE Journal of Selected Topics in Quantum Electronics, 9(2):518–530.
doi:10.1109/JSTQE.2003.813316.
DeLange, O. E. (1970). Wide-band optical communication systems: Part II—frequency-
division multiplexing. Proceedings of the IEEE, 58(10):1683–1690.
Donnay, S. and Gielen, G., editors (2003). Substrate Noise Coupling in Mixed-Signal ASICs.
Kluwer Academic Publishers, Dordrecht, The Netherlands. ISBN: 978-1-4020-7381-6.
Emami-Neyestanak, A. (2004). Design of CMOS Receivers for Parallel Optical Interconnects.
PhD thesis, Stanford University, CA, USA. Available from: http://www-vlsi.
stanford.edu/papers/aen_thesis.pdf.
Espenschied, L. and Affel, H. A. (1931). Concentric conducting system. U.S. Patent
1 835 031. Available from: http://www.pat2pdf.org/pat2pdf/foo.pl?number=
1835031.
Esper-Chain, R., Tobajas, F., Tubio, O., Arteaga, R., de Armas, V., and Sarmiento, R.
(2005). A gigabit multidrop serial backplane for high-speed digital systems based
on asymmetrical power splitter. IEEE Transactions on Circuits and Systems—Part II:
Analog and Digital Signal Processing, 52(1):5–9. doi:10.1109/TCSII.2004.838661.
FCI Airmax VS [online]. (2003). Available from: http://www.fciconnect.com/
electrical-connector/airmax-connector.htm.
Gans, W. L. (1983). The measurement and deconvolution of time jitter in equivalent-
time waveform samplers. IEEE Transactions on Instrumentation and Measurement,
32(1):126–133.
Gase, R. (1995). Representation of Laguerre-Gaussian modes by the Wigner
distribution function. IEEE Journal of Quantum Electronics, 31(10):1811–1818.
doi:10.1109/3.466056.
Gerling, J. (2005). Simulation optischer Multimode-Wellenleiter im Zeitbereich. PhD thesis,
Universität Paderborn, Shaker Verlag, Aachen, Germany. ISBN: 978-3-8322-3836-0.
Ghatak, A., Sharma, E., and Kompella, J. (1988). Exact ray paths in bent waveguides.
Applied Optics, 27(15):3180–3184.
Gholami, A., Toffano, Z., Destrez, A., Pellevrault, S., Pez, M., and Quentel, F.
(2006a). Optimization of VCSEL spatiotemporal operation in MMF links for 10-
Gb Ethernet. IEEE Journal of Selected Topics in Quantum Electronics, 12(4):767–775.
doi:10.1109/JSTQE.2006.876330.
Gholami, A., Toffano, Z., Destrez, A., Pez, M., and Quentel, F. (2006b). Spatiotemporal
and thermal analysis of VCSEL for short-range gigabit optical links. Optical and
Quantum Electronics, 38(4–6):479–493. doi:10.1007/s11082-006-0044-3.
Goodman, J. W., Leonberger, F. J., Kung, S.-Y., and Athale, R. A. (1984). Optical
interconnections for VLSI systems. Proceedings of the IEEE, 72(7):850–866.
Goodwin, M. J., Moseley, A. J., Kearly, M. Q., Morris, R. C., Kirkby, C. J. G., Thompson,
J., Goodfellow, R. C., and Bennion, I. (1991). Optoelectronic component arrays for
optical interconnection of circuits and subsystems. IEEE/OSA Journal of Lightwave
Technology, 9(12):1639–1645. doi:10.1109/50.108708.
Grieg, D. D. and Engelmann, H. F. (1952). Microstrip—a new transmission tech-
nique for the kilomegacycle range. Proceedings of the IRE, 40(12):1644–1650.
doi:10.1109/JRPROC.1952.274144.
Gui, P., Kiamilev, F. E., Wang, X., MacFadden, M. J., Wang, X., Waite, N., Haney,
M. W., and Kuznia, C. (2005). A source-synchronous double-data-rate parallel
optical transceiver IC. IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
13(7):833–842. doi:10.1109/TVLSI.2005.850101.
Gupta, A., Weber, W.-D., and Mowry, T. (1990). Reducing memory and traffic re-
quirements for scalable directory-based cache coherence schemes. In Proceedings of
the 1990 International Conference on Parallel Processing (ICPP 1990), volume I, pages
312–321. Pennsylvania State University Press. ISBN: 978-0-271-00728-1.
Hall, S. H., Hall, G. W., and McCall, J. A. (2000). High-Speed Digital System Design: A
Handbook of Interconnect Theory and Design Practices. Wiley-IEEE Press, New York,
NY, USA, first edition. ISBN: 978-0-471-36090-2.
Heaviside, O. (1885–1887). Electromagnetic induction and its propagation. The Electri-
cian.
Heinrich, J., Zeeb, E., and Ebeling, K. J. (1997). Butt-coupling efficiency of VC-
SELs into multimode fibers. IEEE Photonics Technology Letters, 9(12):1555–1557.
doi:10.1109/68.643258.
Helix AG semiconductors [online]. (founded in 1990). Available from: http://www.
helix.ch/.
Ho, R., Amrutur, B., Mai, K., Wilburn, B., Mori, T., and Horowitz, M. (1998). Ap-
plications of on-chip samplers for test and measurement of integrated circuits. In
Digest of Technical Papers of the 1998 Symposium on VLSI Circuits, pages 138–139. ISBN:
978-0-7803-4766-3. doi:10.1109/VLSIC.1998.688033.
HyperTransport [online]. (first specification in 2001). Available from: http://www.
hypertransport.org/.
IEC 61754-5: International Electrotechnical Commission Standard 61754-5: Fibre
optic connector interfaces—Part 5: Type MT connector family [online]. (2005).
Available from: http://webstore.iec.ch/webstore/webstore.nsf/Standards/
IEC%2061754-5.
IEC 61754-7: International Electrotechnical Commission Standard 61754-7: Fibre op-
tic connector interfaces—Part 7: Type MPO connector family [online]. (2004).
Available from: http://webstore.iec.ch/webstore/webstore.nsf/Standards/
IEC%2061754-7.
IEEE Std 1149.1: IEEE standard test access port and boundary-scan architecture [online].
(2001). Available from: http://ieeexplore.ieee.org/servlet/opac?punumber=
7481. ISBN: 978-0-7381-2944-0.
IEEE Std 1149.6: IEEE standard for boundary-scan testing of advanced digital networks
[online]. (2003). Available from: http://ieeexplore.ieee.org/servlet/opac?
punumber=8513. ISBN: 978-0-7381-3576-2.
Iga, K., Ishikawa, S., Ohkouchi, S., and Nishimura, T. (1984). Room-temperature pulsed
oscillation of GaAlAs/GaAs surface emitting injection laser. Applied Physics Letters,
45(4):348–350. doi:10.1063/1.95265.
InfiniBand [online]. (first specification in 2000). Available from: http://www.
infinibandta.org/.
IO: European Commission FP5 Information Society Technologies Programme contract
IST-2000-28358: Interconnect by optics [online]. (2001–2005). Available from: http:
//io.intec.ugent.be/.
Ishii, Y., Koike, S., Arai, Y., and Ando, Y. (2003). SMT-compatible large-tolerance
“optobump” interface for interchip optical interconnections. IEEE Transactions on
Advanced Packaging, 26(2):122–127. doi:10.1109/TADVP.2003.817332.
ISO/IEC 7498-1 standard: Information technology – Open Systems Interconnection –
basic reference model: The basic model [online]. (Second edition 1994). Available
from: http://standards.iso.org/ittf/PubliclyAvailableStandards/s020269_
ISO_IEC_7498-1_1994(E).zip.
ITRS: International Technology Roadmap for Semiconductors, 2006 update, intercon-
nect [online]. (2006). Available from: http://www.itrs.net/Links/2006Update/
FinalToPost/09_Interconnect2006Update.pdf.
ITRS: International Technology Roadmap for Semiconductors, 2006 update, overview
and working group summaries [online]. (2006). Available from: http://www.itrs.
net/Links/2006Update/FinalToPost/00_ExecSum2006Update.pdf.
ITRS: International Technology Roadmap for Semiconductors, 2006 update, assem-
bly & packaging [online]. (2006). Available from: http://www.itrs.net/Links/
2006Update/FinalToPost/11_AP2006UPDATE.pdf.
Jahns, J., Morgan, R. A., Nguyen, H. N., Walker, J. A., Walker, S. J., and Wong,
Y. M. (1992). Hybrid integration of surface-emitting microlaser chip and planar
optics substrate for interconnection applications. IEEE Photonics Technology Letters,
4(12):1369–1372. doi:10.1109/68.180579.
Jöhnck, M., Wittmann, B., and Neyer, A. (1998). 64-channel two-dimensional POF-
based optical array interchip interconnect. In Chavel, P. H., Miller, D. A. B., and
Thienpont, H., editors, Optics in Computing ’98, volume 3490 of Proceedings of SPIE,
pages 285–288. ISBN: 978-0-8194-2949-0. doi:10.1117/12.308944.
John, S. (1987). Strong localization of photons in certain disordered dielectric superlat-
tices. Physical Review Letters, 58(23):2486–2489. doi:10.1103/PhysRevLett.58.2486.
Jou, J.-J., Liu, C.-K., Hsiao, C.-M., Lin, H.-H., and Lee, H.-C. (2002). Time-delay
circuit model of high-speed p-i-n photodiodes. IEEE Photonics Technology Letters,
14(4):525–527. doi:10.1109/68.992599.
Jungo, M. (2003). Spatiotemporal VCSEL Model for Advanced Simulations of Optical
Links, volume 30 of Series in Quantum Electronics. Hartung-Gorre Verlag, Konstanz,
Germany, first edition. ISBN: 978-3-89649-831-1.
Juniper Networks TX Matrix Platform: Hardware components and cable system [on-
line]. (2004). Available from: http://www.juniper.net/solutions/literature/
white_papers/200099.pdf.
Kanjamala, A. P. and Levi, A. F. J. (1995). Subpicosecond skew in multimode fibre
ribbon for synchronous data transmission. Electronics Letters, 31(16):1376–1377.
doi:10.1049/el:19950912.
Keck, D. B., Maurer, R. D., and Schultz, P. C. (1973). On the ultimate lower limit
of attenuation in glass optical waveguides. Applied Physics Letters, 22(7):307–309.
doi:10.1063/1.1654649.
Keiner, L. E. Physical Oceanography Animations, Coastal Carolina University [online].
(2007). Available from: http://kingfish.coastal.edu/marine/Animations/.
Kerns, K. J. and Yang, A. T. (1997). Stable and efficient reduction of large, multi-
port RC networks by pole analysis via congruence transformations. IEEE Trans-
actions on Computer-Aided Design of Integrated Circuits and Systems, 16(7):734–744.
doi:10.1109/43.644034.
King, R., Michalzik, R., Jung, C., Grabherr, M., Eberhard, F., Jäger, R., Schnitzer, P., and
Ebeling, K. J. (1998). Oxide-confined 2D VCSEL arrays for high-density inter/intra-
chip interconnects. In Choquette, K. D. and Morgan, R. A., editors, Vertical-Cavity
Surface-Emitting Lasers II, volume 3286 of Proceedings of SPIE, pages 64–71. ISBN:
978-0-8194-2725-0. doi:10.1117/12.305445.
Kirchhoff, G. (1845). Ueber den Durchgang eines elektrischen Stromes durch eine
Ebene, insbesondere durch eine kreisförmige. Poggendorff’s Annalen (Annalen der
Physik und Chemie), LXIV:497–514.
Koyama, F., Kinoshita, S., and Iga, K. (1989). Room-temperature continuous wave
lasing characteristics of a GaAs vertical cavity surface-emitting laser. Applied Physics
Letters, 55(3):221–222. doi:10.1063/1.101913.
Krishnamoorthy, A. V. and Goossen, K. W. (1998). Optoelectronic-VLSI: photonics
integrated with VLSI circuits. IEEE Journal of Selected Topics in Quantum Electronics,
4(6):899–912. doi:10.1109/2944.736073.
Kuo, Y.-H., Lee, Y. K., Ge, Y., Ren, S., Roth, J. E., Kamins, T. I., Miller, D. A. B., and
Harris, J. S. (2005). Strong quantum-confined Stark effect in germanium quantum-
well structures on silicon. Nature, 437:1334–1336. doi:10.1038/nature04204.
Kuo, Y.-H., Lee, Y. K., Ge, Y., Ren, S., Roth, J. E., Kamins, T. I., Miller, D. A. B., and
Harris, Jr., J. S. (2006). Quantum-confined Stark effect in Ge/SiGe quantum wells
on Si for optical modulators. IEEE Journal of Selected Topics in Quantum Electronics,
12(6):1503–1513. doi:10.1109/JSTQE.2006.883146.
Li, H. and Iga, K., editors (2002). Vertical-Cavity Surface-Emitting Laser Devices, volume 6
of Springer Series in Photonics. Springer, Berlin, Germany. ISBN: 978-3-540-67851-9.
Liu, J. J., Kalayjian, Z., Riely, B., Chang, W., Simonis, G. J., Apsel, A., and
Andreou, A. (2003). Multichannel ultrathin silicon-on-sapphire optical inter-
connects. IEEE Journal of Selected Topics in Quantum Electronics, 9(2):380–386.
doi:10.1109/JSTQE.2003.814182.
Loke, M.-Y. and McMullin, J. N. (1990). Simulation and measurement of radiation
loss at multimode fiber macrobends. IEEE/OSA Journal of Lightwave Technology,
8(8):1250–1256. doi:10.1109/50.57848.
Lucy, L. B. (1974). An iterative technique for the rectification of observed distributions.
Astronomical Journal, 79(6):745–754. doi:10.1086/111605.
Lyon nanotechnology institute [online]. (2007). Available from: http://inl.ec-lyon.
fr/.
Makie-Fukuda, K., Anbo, T., Tsukada, T., Matsuura, T., and Hotta, M. (1996). Voltage-
comparator-based measurement of equivalently sampled substrate noise waveforms
in mixed-signal integrated circuits. IEEE Journal of Solid-State Circuits, 31(5):726–731.
doi:10.1109/4.509856.
Makino, K., Nakamura, T., Ishigure, T., and Koike, Y. (2005). Analysis of graded-index
polymer optical fiber link performance under fiber bending. IEEE/OSA Journal of
Lightwave Technology, 23(6):2062–2072. doi:10.1109/JLT.2005.849886.
Marion, F., Routin, J., Bockstaele, R., and Rits, O. (2004). Packaging for optical Rx/Tx
arrays: From BGA to CSP packaging. In Proceedings of the 37th International Symposium
on Microelectronics (IMAPS 2004), Long Beach, CA, USA. Poster presentation.
Matsumoto, M. and Nishimura, T. (1998). Mersenne twister: A 623-dimensionally
equidistributed uniform pseudorandom number generator. ACM Transactions on
Modeling and Computer Simulation, 8(1):3–30. doi:10.1145/272991.272995.
Maxim/Dallas Application Note 576—HFAN-02.2.0: Extinction ratio and power
penalty [online]. (2001). Available from: http://www.maxim-ic.com/appnotes.
cfm/an_pk/596.
Maxim/Dallas Semiconductor MAX3805 10.7Gbps adaptive receive equalizer data
sheet [online]. (2006). Available from: http://datasheets.maxim-ic.com/en/ds/
MAX3805.pdf.
Maxwell, J. C. (1865). A dynamical theory of the electromagnetic field. Philosophical
Transactions of the Royal Society of London, 155:459–512.
Mayer, H. (1936). Automatic selective fading control circuits. U.S. Patent 2 054 657.
Available from: http://www.pat2pdf.org/pat2pdf/foo.pl?number=2054657.
Mena, P. V., Morikuni, J. J., Kang, S.-M., Harton, A. V., and Wyatt, K. W. (1999). A com-
prehensive circuit-level model of vertical-cavity surface-emitting lasers. IEEE/OSA
Journal of Lightwave Technology, 17(12):2612–2632. doi:10.1109/50.809684.
Mieyeville, F., Brière, M., O’Connor, I., Gaffiot, F., and Jacquemod, G. (2004). Languages
for system specification: Selected contributions on UML, SystemC, System Verilog, mixed-
signal systems, and property specification from FDL’03, chapter A VHDL-AMS library
of hierarchical optoelectronic device models, pages 183–199. Kluwer Academic
Publishers, Norwell, MA, USA. ISBN: 978-1-4020-7990-0.
Miller, D. A. B. (2000). Rationale and challenges for optical interconnects to electronic
chips. Proceedings of the IEEE, 88(6):728–749. doi:10.1109/5.867687.
Miller, D. A. B., Chemla, D. S., Damen, T. C., Gossard, A. C., Wiegmann, W., Wood,
T. H., and Burrus, C. A. (1984). Band-edge electroabsorption in quantum well
structures: The quantum-confined Stark effect. Physical Review Letters, 53(22):2173–
2176. doi:10.1103/PhysRevLett.53.2173.
Miller, D. A. B. and Özaktaş, H. M. (1997). Limit to the bit-rate capacity of electrical
interconnects from the aspect ratio of the system architecture. Journal of Parallel and
Distributed Computing, 41(1):42–52. doi:10.1006/jpdc.1996.1285.
Miller, S. E., Marcatili, E. A. J., and Li, T. (1973). Research toward optical-fiber
transmission systems. Proceedings of the IEEE, 61(12):1703–1704.
Mohammed, E., Alduino, A., Thomas, T., Braunisch, H., Lu, D., Heck, J., Liu, A., Young,
I., Barnett, B., Vandentop, G., and Mooney, R. (2004). Optical interconnect system
integration for ultra-short-reach applications. Intel Technology Journal, 8(2):115–127.
Montanaro, J., Witek, R. T., Anne, K., Black, A. J., Cooper, E. M., Dobberpuhl, D. W.,
Donahue, P. M., Eno, J., Hoeppner, G. W., Kruckemyer, D., Lee, T. H., Lin, P. C. M.,
Madden, L., Murray, D., Pearce, M. H., Santhanam, S., Snyder, K. J., Stephany, R., and
Thierauf, S. C. (1996). A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor. IEEE
Journal of Solid-State Circuits, 31(11):1703–1714. doi:10.1109/JSSC.1996.542315.
Moore, G. E. (1965). Cramming more components onto integrated circuits.
Electronics, 38(8):114–117.
Moriki, K., Nakahara, K., Hattori, T., and Iga, K. (1988). Single transverse mode
condition of surface-emitting injection lasers. Electronics and Communications in Japan
(Part II: Electronics), 71(1):81–90.
Morikuni, J. J., Mena, P. V., Harton, A. V., Wyatt, K. W., and Kang, S.-M. (1999). Spatially
independent VCSEL models for the simulation of diffusive turn-off transients.
IEEE/OSA Journal of Lightwave Technology, 17(1):95–102. doi:10.1109/50.737427.
Naeemi, A., Xu, J., Mulé, A. V., Gaylord, T. K., and Meindl, J. D. (2004). Optical and elec-
trical interconnect partition length based on chip-to-chip bandwidth maximization.
IEEE Photonics Technology Letters, 16(4):1221–1223. doi:10.1109/LPT.2004.824623.
Nagata, M., Nagai, J., Morie, T., and Iwata, A. (2000). Measurements and anal-
yses of substrate noise waveform in mixed-signal IC environment. IEEE Trans-
actions on Computer-Aided Design of Integrated Circuits and Systems, 19(6):671–678.
doi:10.1109/43.848088.
Nagata, M., Nagai, K., Hijikata, K., Morie, T., and Iwata, A. (2001). Physical design
guides for substrate noise reduction in CMOS digital circuits. IEEE Journal of
Solid-State Circuits, 36(3):539–549. doi:10.1109/4.910494.
Nakwaski, W. (1996). Thermal aspects of efficient operation of vertical-
cavity surface-emitting lasers. Optical and Quantum Electronics, 28(4):335–352.
doi:10.1007/BF00287023.
Neefs, H. (2000a). Latentiebeheersing in Processors – en implicaties op de opportu-
niteit van optische interconnecties (Latency management in Processors – and implica-
tions on the sense of optical interconnections). PhD thesis, Ghent University, Bel-
gium. Available from: http://aleph.ugent.be/F/?func=direct&doc_library=
RUG01&doc_number=001043895.
Neefs, H., editor (2000b). Optoelectronic Interconnects for Integrated Circuits: Achieve-
ments 1996–2000. Office for Official Publications of the European Communities,
Luxembourg. ISBN: 978-92-828-9477-4. Available from: http://bookshop.europa.
eu/eGetRecords?Template=Test_EUB/en_publication_details&UID=251079.
O’Connor, I., Tissafi-Drissi, F., Gaffiot, F., Dambre, J., De Wilde, M., Van Campenhout,
J., Van Thourhout, D., Van Campenhout, J., and Stroobandt, D. (2007). System-
atic simulation-based predictive synthesis of integrated optical interconnect. IEEE
Transactions on Very Large Scale Integration (VLSI) Systems. To be published.
O’Connor, I., Tissafi-Drissi, F., Navarro, D., Mieyeville, F., Gaffiot, F., Dambre, J.,
De Wilde, M., Stroobandt, D., and Briere, M. (2006a). Integrated optical interconnect
for on-chip data transport. In Proceedings of the 4th IEEE North-East Workshop on
Circuits and Systems (NEWCAS 2006), pages 209–212, Gatineau, Canada. ISBN: 978-
1-4244-0416-2. Available from: http://escher.elis.ugent.be/publ/Edocs/DOC/
P106_142.pdf, doi:10.1109/NEWCAS.2006.250937.
O’Connor, I., Tissafi-Drissi, F., Navarro, D., Mieyeville, F., Gaffiot, F., Dambre, J.,
De Wilde, M., Stroobandt, D., and Van Thourhout, D. (2006b). Optical interconnect
for on-chip data communication. In Workshop on Future Interconnect and Networks on
Chip (NoC) at the 2006 Design, Automation and Test in Europe (DATE 2006) Conference,
volume Poster Abstracts, page 74, Munich, Germany. Available from: http://async.
org.uk/noc2006/pdf/Proceedings_Poster_Abstracts.pdf.
O’Connor, I., Tissafi-Drissi, F., Stroobandt, D., Dambre, J., and De Wilde, M. PICMOS
(photonic interconnect layer on CMOS by waferscale integration) deliverable 1.2:
Definition and quantification of specifications for different subcomponents [online].
(2006). Available from: http://picmos.intec.ugent.be/fileadmin/projectdata/
deliverables/D12_draft.pdf.
OIIC—European Commission Esprit project 22641: Generic approach to manufac-
turable optoelectronic interconnects for VLSI circuits [online]. (1996–2000). Available
from: http://oiic.intec.ugent.be/.
Ossicini, S., Pavesi, L., and Priolo, F. (2003). Light Emitting Silicon for Microphotonics,
volume 194 of Springer Tracts in Modern Physics. Springer, Berlin, Germany. ISBN:
978-3-540-40233-6.
Ottevaere, H. (2003). Refractive Microlenses and Micro-optical Structures for Multi-parameter
Sensing: A Touch of Micro-Photonics. PhD thesis, Vrije Universiteit Brussel, Belgium.
Owens, B. E., Adluri, S., Birrer, P., Shreeve, R., Arunachalam, S. K., Mayaram,
K., and Fiez, T. S. (2005). Simulation and measurement of supply and sub-
strate noise in mixed-signal ICs. IEEE Journal of Solid-State Circuits, 40(2):382–391.
doi:10.1109/JSSC.2004.841039.
Paniccia, M., Krutul, V., Jones, R., Cohen, O., Bowers, J., Fang, A., and Park, H. A
hybrid silicon laser: Silicon photonics technology for future tera-scale computing
[online]. (2006). Available from: ftp://download.intel.com/research/platform/
sp/hl_wp1.pdf. Intel White Paper.
Pêcheux, F., Lallement, C., and Vachoux, A. (2005). VHDL-AMS and Verilog-AMS as
alternative hardware description languages for efficient modeling of multidiscipline
systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
24(2):204–225. doi:10.1109/TCAD.2004.841071.
PCI Express [online]. (introduced in 2004). Available from: http://www.pcisig.com/
specifications/pciexpress/.
Pepeljugoski, P. K. and Kuchta, D. M. (2003). Design of optical communica-
tions data links. IBM Journal of Research and Development, 47(2/3):223–237.
doi:10.1147/rd.472.0223.
PICMOS—European Commission FP6 Information Society Technologies Programme
contract FP6-2002-IST-1-002131: Photonic interconnect layer on CMOS by waferscale
integration [online]. (2004–2006). Available from: http://picmos.intec.ugent.
be/.
POS-PHY Level 4 [online]. (specification fixed in 2001). Available from: http://www.
pmc-sierra.com/posphylevel4.
RapidIO [online]. (first specification in 1999). Available from: http://www.rapidio.
org/.
Lord Rayleigh (1897). On the passage of electric waves through tubes, or the vibration
of dielectric cylinders. Philosophical Magazine, 43:125–132.
Razavi, B. (2003). Design of Integrated Circuits for Optical Communications. McGraw-Hill.
ISBN: 978-0-07-282258-8.
Rho, B. S., Kang, S., Cho, H. S., Park, H.-H., Ha, S.-W., and Rhee, B.-H. (2004).
PCB-compatible optical interconnection using 45°-ended connection rods and via-
holed waveguides. IEEE/OSA Journal of Lightwave Technology, 22(9):2128–2134.
doi:10.1109/JLT.2004.833533.
Richardson, W. H. (1972). Bayesian-based iterative method of image restoration. Journal
of the Optical Society of America, 62:55–59.
Rits, O., Bockstaele, R., and Baets, R. (2004). Packaging solution with a 3-dimensional
coupling scheme for direct optical interconnects to the chip. In Proceedings of the 9th
Annual Symposium of the IEEE Lasers and Electro-Optics Society (LEOS) Benelux Chapter,
pages 279–282.
Rits, O., De Wilde, M., Roelkens, G., Bockstaele, R., Annen, R., Bossard, M., Marion, F.,
and Baets, R. (2006). 2D parallel optical interconnects between CMOS ICs. In Eldada,
L. A. and Lee, E.-H., editors, Optoelectronic Integrated Circuits VIII, volume 6124 of
Proceedings of SPIE, pages 168–179. ISBN: 978-0-8194-6166-7. Available from: http://
escher.elis.ugent.be/publ/Edocs/DOC/P106_035.pdf, doi:10.1117/12.652108.
Roelkens, G., Brouckaert, J., Van Thourhout, D., Baets, R., Nötzel, R., and Smit, M.
(2006). Adhesive bonding of InP/InGaAsP dies to processed silicon-on-insulator
wafers using DVS-bis-benzocyclobutene. Journal of The Electrochemical Society,
153(12):G1015–G1019. doi:10.1149/1.2352045.
Roelkens, G., Van Thourhout, D., and Baets, R. (2005). Coupling schemes
for heterogeneous integration of III-V membrane devices and silicon-on-
insulator waveguides. IEEE/OSA Journal of Lightwave Technology, 23(11):3827–3831.
doi:10.1109/JLT.2005.856231.
Rong, H., Jones, R., Liu, A., Cohen, O., Hak, D., Fang, A., and Paniccia,
M. (2005). A continuous-wave Raman silicon laser. Nature, 433:725–728.
doi:10.1038/nature03346.
Rubinstein, J., Penfield, P., and Horowitz, M. A. (1983). Signal delay in RC tree
networks. IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems, 2(3):202–211.
Sabet, P. B. and Ilponse, F. (2001). Modeling crosstalk noise for deep submicron verifica-
tion tools. In Proceedings of the 2001 Design, Automation and Test in Europe (DATE 2001)
Conference, pages 530–534, Munich, Germany. doi:10.1109/DATE.2001.915074.
Sadi, T., Kelsall, R., and Pilgrim, N. (2006). Simulation of electron transport in
InGaAs/AlGaAs HEMTs using an electrothermal Monte Carlo method. IEEE Trans-
actions on Electron Devices, 53(8):1768–1774. doi:10.1109/TED.2006.877698.
Schares, L., Kash, J. A., Doany, F. E., Schow, C. L., Schuster, C., Kuchta, D. M.,
Pepeljugoski, P. K., Trewhella, J. M., Baks, C. W., John, R. A., Shan, L., Kwark, Y. H.,
Budd, R. A., Chiniwalla, P., Libsch, F. R., Rosner, J., Tsang, C. K., Patel, C. S., Schaub,
J. D., Dangel, R., Horst, F., Offrein, B. J., Kucharski, D., Guckenberger, D., Hegde, S.,
Nyikal, H., Lin, C.-K., Tandon, A., Trott, G. R., Nystrom, M., Bour, D. P., Tan, M. R. T.,
and Dolfi, D. W. (2006). Terabus: Terabit/second-class card-level optical interconnect
technologies. IEEE Journal of Selected Topics in Quantum Electronics, 12(5):1032–1044.
doi:10.1109/JSTQE.2006.881906.
Scott, J. W., Geels, R. S., Corzine, S. W., and Coldren, L. A. (1993). Modeling
temperature effects and spatial hole burning to optimize vertical-cavity surface-
emitting laser performance. IEEE Journal of Quantum Electronics, 29(5):1295–1308.
doi:10.1109/3.236145.
Shockley, W. (1950). Electrons and Holes in Semiconductors: With Applications to Transistor
Electronics. Van Nostrand, New York, NY, USA. ISBN: 978-0-88275-382-9 (reprint by
Krieger Publishing Company).
Sialm, G., Erni, D., Vez, D., Kromer, C., Ellinger, F., Bona, G.-L., Morf, T., and Jäckel,
H. (2005). Tradeoffs of vertical-cavity surface emitting lasers modeling for the
development of driver circuits in short distance optical links. Optical Engineering,
44(10):105401. doi:10.1117/1.2087267.
Sialm, G., Kromer, C., Ellinger, F., Morf, T., Erni, D., and Jäckel, H. (2006).
Design of low-power fast VCSEL drivers for high-density links in 90-nm SOI
CMOS. IEEE Transactions on Microwave Theory and Techniques, 54(1):65–73.
doi:10.1109/TMTT.2005.860893.
Simon, R. and Agarwal, G. S. (2000). Wigner representation of Laguerre-Gaussian
beams. Optics Letters, 25(18):1313–1315.
Simpson, M. L., Ericson, M. N., Jellison, G. E., Jr., Dress, W. B., Wintenberg, A. L., and
Bobrek, M. (1999). Application specific spectral response with CMOS compatible
photodiodes. IEEE Transactions on Electron Devices, 46(5):905–913. doi:10.1109/16.760396.
SNAP12: 12-channel pluggable optical module multi-source agreement
specifications, revision 1.1 [online]. (2002). Available from:
http://www.physik.unizh.ch/~avollhar/snap12msa_051502.pdf.
Soda, H., Iga, K., Kitahara, C., and Suematsu, Y. (1979). GaInAsP/InP surface
emitting injection lasers. Japanese Journal of Applied Physics, 18(12):2329–2330.
doi:10.1143/JJAP.18.2329.
Souders, T. M., Flach, D. R., Hagwood, C., and Yang, G. L. (1990). The effects of timing
jitter in sampling systems. IEEE Transactions on Instrumentation and Measurement,
39(1):80–85. doi:10.1109/19.50421.
SPICE: Simulation Program with Integrated Circuits Emphasis [online]. (first version in
1975). Available from: http://bwrc.eecs.berkeley.edu/Classes/IcBook/SPICE/.
Statz, H. and deMars, G. A. (1960). Transients and oscillation pulses in masers. In
Townes, C. H., editor, Quantum Electronics, pages 530–537, New York, NY, USA.
Columbia University Press.
Straub, P. (2004). PPC-optoboard: glass sheet waveguides in PCBs. In Workshop on
Parallel Optical Interconnects Inside Electronic Systems at the 2004 Design, Automation
and Test in Europe (DATE 2004) Conference, Paris, France. Available from: http:
//io.intec.ugent.be/DATEworkshop/PPC_presentation.ppt.
Su, D. K., Loinaz, M. J., Masui, S., and Wooley, B. A. (1993). Experimental results and
modeling techniques for substrate noise in mixed-signal integrated circuits. IEEE
Journal of Solid-State Circuits, 28(4):420–430. doi:10.1109/4.210024.
Thienpont, H., Debaes, C., Baukens, V., Ottevaere, H., Vynck, P., Tuteleers, P., Verschaf-
felt, G., Volckaerts, B., Hermanne, A., and Hanney, M. (2000). Plastic microoptical
interconnection modules for parallel free-space inter and intra-MCM data communi-
cation. Proceedings of the IEEE, 88(6):769–779. doi:10.1109/5.867691.
Tissafi-Drissi, F., O’Connor, I., and Gaffiot, F. (2004). RUNE: platform for automated
design of integrated multi-domain systems application to high-speed CMOS pho-
toreceiver front-ends. In Proceedings of the 2004 Design, Automation and Test in Europe
(DATE 2004) Conference, volume 3, pages 16–21, Paris, France. ISBN: 978-0-7695-2085-
8. doi:10.1109/DATE.2004.1269191.
Toffano, Z., Pez, M., Desgreys, P., Hervé, Y., Le Brun, C., Mollier, J.-C., Barbary, G.,
Charlot, J.-J., Constant, S., Destrez, A., Karray, M., Marec, M., Rissons, A., and
Snaidero, S. (2003). Multilevel behavioral simulation of VCSEL-based optoelec-
tronic modules. IEEE Journal of Selected Topics in Quantum Electronics, 9(3):949–960.
doi:10.1109/JSTQE.2003.818348.
Tong, Q.-Y. and Gösele, U. (1999). Semiconductor Wafer Bonding: Science and Technology.
The Electrochemical Society Series. John Wiley & Sons, New York, NY, USA. ISBN:
978-0-471-57481-1.
Tummala, R. (2001). Fundamentals of Microsystems Packaging. McGraw-Hill, New York,
USA, first edition. ISBN: 978-0-07-137169-8.
Uhlig, S. and Robertsson, M. (2006). Limitations to and solutions for optical loss
in optical backplanes. IEEE/OSA Journal of Lightwave Technology, 24(4):1710–1724.
doi:10.1109/JLT.2006.870978.
Valle, A., Sarma, J., and Shore, K. (1995). Spatial holeburning effects on the dynamics
of vertical cavity surface-emitting laser diodes. IEEE Journal of Quantum Electronics,
31(8):1423–1431. doi:10.1109/3.400393.
Van Campenhout, J. M. (1998). Optics and the CMOS interconnection problem: a
systems and circuits perspective. International Journal of Optoelectronics, 12(4):145–154.
Van Campenhout, J. M. (2000). Computing structures and optical interconnect: friends
or foes? In Bains, S. and Irakliotis, L. J., editors, Critical Technologies for the Future
of Computing, volume 4109 of Proceedings of SPIE, pages 206–216. ISBN: 978-0-8194-
3754-9. Available from: http://escher.elis.ugent.be/publ/Edocs/DOC/P100_
129.pdf.
Van Campenhout, J. M., Brunfaut, M., Meeus, W., Dambre, J., and De Wilde, M. (2001).
Sense and nonsense of logic-level optical interconnect: reflections on an experiment.
In Taghizadeh, M., Thienpont, H., and Jabbour, G. E., editors, Micro- and Nano-optics
for Optical Interconnection and Information Processing, volume 4455 of Proceedings of
SPIE, pages 151–159. ISBN: 978-0-8194-4169-0. Available from: http://escher.
elis.ugent.be/publ/Edocs/DOC/P101_145.pdf, doi:10.1117/12.450436.
Van Steenberge, G., Hendrickx, N., Bosman, E., Van Erps, J., Thienpont, H., and
Van Daele, P. (2006). Laser ablation of parallel optical interconnect waveguides. IEEE
Photonics Technology Letters, 18(9):1106–1108. doi:10.1109/LPT.2006.873357.
Venditti, M. B. and Plant, D. V. (2003). On the design of large receiver and trans-
mitter arrays for OE-VLSI applications. IEEE/OSA Journal of Lightwave Technology,
21(12):3406–3416. doi:10.1109/JLT.2003.822234.
Venditti, M. B., Schwartz, J. D., and Plant, D. V. (2004). Skew reduction for synchronous
OE-VLSI receiver applications. IEEE Photonics Technology Letters, 16(6):1552–1554.
doi:10.1109/LPT.2004.827100.
Verilog-AMS [online]. (first version in 1998). Available from: http://www.eda.org/
verilog-ams/.
Verschaffelt, G. (2000). Vertical-Cavity Surface Emitting Lasers for Parallel Optical datacom-
munication. PhD thesis, Vrije Universiteit Brussel, Belgium.
Verspecht, J. (1994). Compensation of timing jitter-induced distortion of sampled
waveforms. IEEE Transactions on Instrumentation and Measurement, 43(5):726–732.
doi:10.1109/19.328895.
VHDL-AMS (IEEE 1076.1) [online]. (standardized in 1999). Available from: http:
//www.eda.org/vhdl-ams/.
Wang, G., Tokumitsu, T., Hanawa, I., Yoneda, Y., Sato, K., and Kobayashi,
M. (2003). A time-delay equivalent-circuit model of ultrafast p-i-n photodi-
odes. IEEE Transactions on Microwave Theory and Techniques, 51(4):1227–1233.
doi:10.1109/TMTT.2003.809642.
Watanabe, Y. and Nishizawa, J.-I. (application date Dec. 20, 1950). Semiconductor
devices with high resistivity thin layer [in Japanese]. Japanese Patent 205 068,
published no. 28-6077.
Widmer, A. X. and Franaszek, P. A. (1983). A DC-balanced, partitioned-block, 8B/10B
transmission code. IBM Journal of Research and Development, 27(5):440–451.
Windisch, R., Dutta, B., Kuijk, M., Knobloch, A., Meinlschmidt, S., Schoberth, S., Kiesel,
P., Borghs, G., Dohler, G., and Heremans, P. (2000). 40% efficient thin-film
surface-textured light-emitting diodes by optimization of natural lithography.
IEEE Transactions on Electron Devices, 47(7):1492–1498. doi:10.1109/16.848298.
Winkler, C., Love, J. D., and Ghatak, A. K. (1979). Loss calculations in bent
multimode optical waveguides. Optical and Quantum Electronics, 11(2):173–183.
doi:10.1007/BF00624396.
Yablonovitch, E. (1987). Inhibited spontaneous emission in solid-state
physics and electronics. Physical Review Letters, 58(20):2059–2062.
doi:10.1103/PhysRevLett.58.2059.
Yoon, K. B., Cho, I.-K., Ahn, S. H., Jeong, M. Y., Lee, D. J., Heo, Y. U., Rho, B. S., Park,
H.-H., and Rhee, B.-H. (2004). Optical backplane system using waveguide-embedded
PCBs and optical slots. IEEE/OSA Journal of Lightwave Technology, 22(9):2119–2127.
doi:10.1109/JLT.2004.833826.
Yu, S. F. (2003). Analysis and Design of Vertical Cavity Surface Emitting Lasers. Wiley
Series in Lasers and Applications. John Wiley & Sons, Hoboken, NJ, USA. ISBN:
978-0-471-39124-1. doi:10.1002/0471723789.
Yukawa, A. (1985). A CMOS 8-bit high-speed A/D converter IC. IEEE Journal of
Solid-State Circuits, 20(3):775–779.
Zhang, H., Mrozynski, G., Wallrabenstein, A., and Schrage, J. (2004). Analysis of
transverse mode competition of VCSELs based on a spatially independent model.
IEEE Journal of Quantum Electronics, 40(1):18–24. doi:10.1109/JQE.2003.820842.
Zhang, J. M. and Conn, D. R. (1992). State-space modeling of the PIN photodetector.
IEEE/OSA Journal of Lightwave Technology, 10(5):603–609. doi:10.1109/50.136094.
Zimmermann, H. (2004). Silicon Photonics, volume 94 of Topics in Applied Physics,
chapter Silicon Photo-Receivers, pages 239–268. Springer Berlin/Heidelberg. ISBN:
978-3-540-21022-1.
Zobel, O. J. (1926). Electrical network and method of transmitting electric currents.
U.S. Patent 1 603 305. Available from: http://www.pat2pdf.org/pat2pdf/foo.pl?
number=1603305.
Zubia, J. and Arrue, J. (2001). Plastic optical fibers: An introduction to their
technological processes and applications. Optical Fiber Technology, 7(2):101–140.
doi:10.1006/ofte.2000.0355.
Appendix A
Project background
A.1 Introduction
The work described in this thesis was carried out in the context of the
Interconnect by Optics (IO) project of the European Commission’s Fifth
Framework Programme, which ran from the end of 2001 to early 2005. We start
by giving due credit to its predecessor (1996–2000): the Optically
Interconnected ICs (OIIC) project of the European Commission’s Esprit
Programme.
A.2 OIIC project
The OIIC project was defined as a rather broad OE-VLSI exploration project,
with the following official objectives (the references indicated are ensuing PhD
dissertations):
• (summarizing objective) to identify solutions for the interconnect bottle-
neck expected in future generation CMOS ICs
• to establish key technologies to implement two-dimensional optical
interconnects [Verschaffelt, 2000; Bockstaele, 2001; Baukens, 2001; Annen,
2002; Ottevaere, 2003; Debaes, 2003; Coosemans, 2006]
• to assess cost effectiveness and manufacturability of the proposed inter-
connect scheme and compare this to other approaches
• to identify new processing architectures exploiting the benefits of OE-
VLSI interconnect [Neefs, 2000a; Bui Viet, 2004]
Discussing all results here would lead us too far; we limit ourselves to an
account of the technological outcome. The technological effort of the OIIC
project has led to the exploration and realization of several OE-VLSI
approaches. The optical path effort has resulted in several intra-package
free-space interconnect realizations [Thienpont et al., 2000; Baukens, 2001;
Debaes, 2003] and an inter-package link technology employing connectorized
plastic optical fiber (POF) bundles (see figure A.1) [Jöhnck et al., 1998;
Coosemans, 2006]. Regarding optical emitters, links using VCSELs [King et al.,
1998] and Resonant Cavity Light Emitting Diodes (RCLEDs) [Bockstaele et al.,
1999; Bockstaele, 2001] have been demonstrated, and efficient nonresonant
cavity LEDs have been realized as well [Windisch et al., 2000].
Figure A.1: Guided-wave OE-VLSI demonstrator of the OIIC project. The 4×8 VCSEL
and photodiode arrays are clearly discernible on the inset.
The main demonstrator employed optical emitters and detectors organized
in two-dimensional arrays on a 250-µm device pitch, flip-chip bonded onto
a dedicated CMOS field-programmable gate array (FPGA) [Brunfaut et al.,
2001]. Channel bit rates of about 100 Mbps have been achieved using 4×8
VCSEL arrays, up to 250 Mbps using 8×8 RCLED arrays, and up to 622 Mbps
on a high-speed 2×8-parallel VCSEL array demonstrator.
A.2.1 IO project
Supported by the OIIC project experience, the IO project emerged in 2001
as an industrialization project for chip-to-chip interconnect. The partners
involved were IMEC (demonstrator design and OE-VLSI packaging), Avalon
Photonics (lasers), Optospeed/Albis Optoelectronics AG (photodiodes), Helix
AG (driver/receiver circuits), CEA-LETI (flip-chip hybridization and OE-VLSI
packaging), FCI (connectors), Nexans (POF), RCI (POF wiring), PPC Electron-
ics AG (PCB-integrated waveguides) and Alcatel Bell (Xantium router)/Alcatel
CIT (deployment study).
The main objectives of this inter-chip optical interconnect project were
• to improve performance by increasing the channel bit rate to the
multi-Gbps range and the number of parallel channels to 16×16
• to develop an industrially viable manufacturing methodology—originally
including the tight integration of the optical path in PCBs and rack-based
setups, besides the package-level integration obviously required
• to demonstrate the viability of the developed optical interconnect
methodology by its integration into the Alcatel Xantium system, a
prominent terabit IP core router of that time
The IO project did reach the intended bit rate and parallelism increase [Rits
et al., 2006]. The O-E conversion arrays and their integration with CMOS
ICs into optically accessible packages have produced OE-VLSI modules with
high yield and electrical, optical and mechanical performances meeting the
requirements for a reliable interconnect.
A.3 Relevant IO project developments
In this section we present the hardware system developments of the IO project
that are relevant for this work (by no means all project developments are
mentioned here).
A.3.1 DTA IC
The first move of the IO project was the establishment of the main
demonstrator architecture and the development of the main demonstrator IC.
Direct OE-VLSI integration on the Xantium core routing chip was not an option,
as its 0.25-µm CMOS (ST foundry) technology was not available to the project.
A test chip in 0.35-µm CMOS (AMS foundry), named Digital Technology
Assessment (DTA) IC, would instead be placed next to the Xantium chip
and primarily perform a 16-channel O-E conversion in two directions at a
1.25 Gbps line rate.
This DTA IC (shown in figure A.2) employed a true OE-VLSI construction,
harboring an 8×8-channel driver circuit array (at a 250-µm pitch) and a
similar receiver circuit array on its 12-by-12 mm surface, onto which
VCSEL and photodiode arrays were to be flip-chip bonded. Besides the O-E
interfacing circuits, the DTA CMOS implemented 48-channel bitstream
resynchronization and test pattern generation provisions, an adaptable
permutation network linking optical and electrical inputs to outputs alike,
and an on-chip bit error ratio (BER) tester. This approach by itself permitted
the parallel technological testing of all 64 optical channels (in each
direction) between two DTA ICs in the presence of only 16 external electrical
inputs and outputs per DTA.
Figure A.2: Photograph (a) and main architecture (b) of the 0.35-µm CMOS
demonstrator chip, going by the name of Digital Technology Assessment (DTA) IC.
Blocks labeled in the figure: 8×8 receiver circuits, 8×8 driver circuits,
72×72 permutation switch, 48× resynchronization, pattern generation and
SerDes, BER and RAM; the architecture connects 64 optical and 16 electrical
inputs to 64 optical and 16 electrical outputs.
For clarity’s sake we emphasize that there is no net (de)serialization: one
external electrical input is converted to one optical output and vice versa.
At most 16 of the 64 optical links could therefore carry live external data,
while the others would be silent or transmit test patterns.
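The channel accounting of the DTA IC can be checked with a few lines of
Python. The sketch below is only illustrative: the array size (8×8), the 16
external electrical lanes and the 1.25 Gbps line rate are taken from the text
above, whereas the function name dta_channel_budget and the assumption that
every optical channel runs at the full line rate simultaneously are ours.

```python
def dta_channel_budget(rows=8, cols=8, external_lanes=16, line_rate_gbps=1.25):
    """Count live and test-only optical links of one DTA IC, per direction.

    Assumption (not from the thesis): all optical channels operate at the
    full line rate at the same time.
    """
    optical = rows * cols
    # No net (de)serialization: one electrical lane feeds one optical link.
    live = min(external_lanes, optical)
    return {
        "optical_channels": optical,
        "live_channels": live,
        "test_or_silent_channels": optical - live,
        "live_throughput_gbps": live * line_rate_gbps,
        "optical_capacity_gbps": optical * line_rate_gbps,
    }


if __name__ == "__main__":
    for name, value in dta_channel_budget().items():
        print(f"{name}: {value}")
```

With the default values, 16 links carry live data (20 Gbps per direction) and
48 links remain for test patterns or silence, a number that coincides with the
48-channel pattern generation provisions of the DTA IC.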
A.3.2 Demonstrator PCBs
Three PCB designs were developed for the system demonstrator of the IO
project: a backplane, and pluggable ‘converter’ and ‘switch’ boards.
The original intent is rendered in figure A.3. On a converter board (figure
A.4), 16 electrical front side inputs can be directed to 64 optical outputs of one
DTA IC (which should be further optically routed to the backplane), with a
similar connection in the reverse direction.
The switch board was planned to contain 2 DTA ICs, each directing 64 optical
backplane inputs and outputs to a single Xantium router chip over 16 electrical
input and as many output lanes. The Xantium chip would then demonstrate
its routing functionality between the connections of both DTA ICs.
The backplane was intended to implement, besides power and control
provisioning, the optical path connection over cables or fiber-embedding
flexible foils: either directly between two converter boards, or between two
converter boards and a switch board.
A powerful test and measurement interface has been implemented in firmware
on FPGAs dedicated to each DTA IC, mutually communicating over the
backplane, and linked with a host computer running in-house test software
through a serial connection.
The development of the switch board was discontinued after the whole Xantium
IP router project was shut down by Alcatel Bell. The other boards were
finished as planned. The backplanes and associated racks are shown in figure
A.5; however, most testing has been performed on and between the DTA
converter boards, which could also operate without a backplane. Tests performed
on three experimental DTA converter boards form the basis of the stochastic
system modeling and characterization research effort of chapter 4.
A.3.3 High-speed design in 0.18-µm CMOS
Besides the IO system demonstrator PCBs and DTA modules, the project
objectives also envisaged the implementation of 2.5 Gbps line rates and 16×16
array sizes. However, the excessive cost of a full CMOS run in a state-of-the-art
technology excluded the possibility of a new OE-VLSI design.
To cope with this issue, a clever solution was devised: the separation of the
objectives (implementation on actual CMOS, 2.5 Gbps line rate and 16×16 array
size) into different demonstrations, together much less expensive than a new
CMOS run. The actual OE-VLSI combination was already catered for by the
DTA IC; therefore the other demonstrations employed an inexpensive
metal-on-glass substrate approach. In this way, the successful 16×16
demonstration amounted to a simple probe-based testing of the operation and
the substrate connectivity of each device in the O-E device arrays.
Figure A.3: Original connectivity intent for the demonstrator PCBs in the IO
project: (a) connectivity between 2 converter boards mutually; (b) connectivity
between 2 converter boards and a switch board; (c) interconnection of the
demonstrator and the Alcatel Xantium system.
Figure A.4: Photographs of a converter board containing a DTA IC: (a) front
side; (b) back side.
For the 2.5 Gbps line rate demonstration, fast CMOS was obviously
indispensable. This problem has been addressed through a smart combination of
the metal-on-glass method and an affordable 5-by-5 mm multi-project wafer (MPW)
reticle in 0.18-µm CMOS (UMC foundry) technology.
Figure A.6 shows the approach: an oblong metal-on-glass carrier was used to
hold 8×8-sized VCSEL and photodiode arrays, of which the 12 most aptly
positioned devices were routed to bond pads at the long edge. This carrier
was placed in the original DTA IC package, properly fixing the array locations
to ensure compatibility with the original DTA packaging approach. Finally,
through the wire-bonding of the carrier and the package to two small CMOS
ICs cut from the reticle—one carrying 12 VCSEL high-speed driver circuits and
the other as many photodiode receiver circuits—the required O-E interfacing
circuits and the external electrical interface were implemented.
The single-dimensional adjacency of the driver and receiver circuits admittedly
allowed larger—and therefore more complex or better isolated—circuit designs
than a true two-dimensional OE-VLSI scheme; nonetheless the higher mod-
ulation capability of the O-E conversion devices was actually demonstrated
this way.
Figure A.5: Rack with visible backplane and one converter board. The holes in the
converter board and the backplane are features for a possible build-up optical path.
Figure A.6: View into the package of the 0.18-µm CMOS demonstrator setup.
Labeled parts: VCSEL array, photodiode array, metal-on-glass carrier,
12-channel driver IC, and 12-channel receiver IC with substrate noise circuits.
Figure A.7: Photographs of a demonstrator board containing a 0.18-µm CMOS
module: (a) front side; (b) back side.
Appendix B
Verilog-AMS simulation models
B.1 Simplied interfacing circuits
The following code fragment corresponds to the simplified driver/receiver
models discussed in section 3.3.2.
`include "constants.vams"
`include "disciplines.vams"
module driver(in,out,vdd,gnd);
inout in,out,vdd,gnd;
electrical in,out,vdd,gnd;
/* the effective parameter values are overridden by the environment */
parameter real Cintx=0; /* driver input capacitance */
parameter real ts=0; /* time constant of current switch */
parameter real IMOD=0; /* modulation current */
parameter real IBIAS=0; /* bias current */
parameter real Cd=0; /* driver output capacitance */
parameter real qCd=0; /* -> proportion to ground */
parameter real Rd=0; /* driver output resistance */
parameter real Vd=0; /* expected average VCSEL voltage */
/* -> the operating voltage over C4 */
/* internal nodes */
real lowpass_signal;
analog begin
/* input capacitance */
I(in,gnd) <+ Cintx*ddt(V(in,gnd));
/* intrinsic frequency response of the driver */
lowpass_signal = laplace_zp(V(in,gnd)/V(vdd,gnd),{},{-1/ts,0});
/* output current */
I(vdd,out) <+ IBIAS + IMOD + (Vd-V(out,gnd))/Rd
+ (1-qCd)*Cd*ddt(V(vdd,out));
I(out,gnd) <+ (1-lowpass_signal)*IMOD + qCd*Cd*ddt(V(out,gnd));
/* I(vdd,gnd) could be added to model extra internal dissipation */
end
endmodule
module receiver(in,out,vdd,gnd);
inout in,out,vdd,gnd;
electrical in,out,vdd,gnd;
/* the effective parameter values are overridden by the environment */
parameter real Cinrx=0; /* receiver input capacitance */
parameter real qCinrx=0; /* -> proportion to ground */
parameter real Cpd=0; /* photodiode capacitance */
parameter real A=0; /* preamplifier gain */
parameter real fa=0; /* preamplifier bandwidth */
parameter real A2=0; /* postamplifier gain */
parameter real f2=0; /* postamplifier bandwidth */
parameter real frollon=0; /* roll-on corner frequency */
parameter real Qf=0; /* quality factor (the f was added
to avoid a name conflict) */
parameter real tdelay=0; /* intrinsic latency
(for an improved accuracy this can be
made to depend on the pk-pk
photocurrent amplitude) */
parameter real Rout=0; /* output resistance (R4 and R5) */
parameter real Vt_initial=0; /* dc starting point for C3 */
/* internally calculated parameter */
parameter real preamp_res=(1+A)/(Qf*`M_TWO_PI*fa*(Cpd+Cinrx));
/* internal nodes */
voltage preamp, postamp, postamp_delayed, threshold;
real unclipped_signal, clipped_signal;
analog begin
@(initial_step) begin
V(postamp_delayed) <+ Vt_initial;
end
/* input capacitance */
I(in,vdd) <+ (1-qCinrx)*Cinrx*ddt(V(in,vdd));
I(in,gnd) <+ qCinrx*Cinrx*ddt(V(in,gnd));
/* preamplifier */
V(preamp,gnd) <+ A*laplace_zp(V(in,gnd),{},{-`M_TWO_PI*fa,0});
I(in,gnd) <+ (V(in,gnd)+V(preamp,gnd))/preamp_res;
/* postamplifier */
V(postamp,gnd) <+ A2*laplace_zp(V(preamp,gnd),{},{-`M_TWO_PI*f2,0});
V(postamp_delayed) <+ absdelay(V(postamp,gnd),tdelay);
/* adaptive threshold */
V(threshold,gnd) <+ laplace_zp(V(postamp_delayed,gnd),{},
{-`M_TWO_PI*frollon,0});
/* comparison */
unclipped_signal = 2*(V(postamp_delayed,threshold))+I(out,gnd);
/* limiting amplifier */
case (1)
(Rout*unclipped_signal > V(vdd,gnd)):
clipped_signal = 0;
(Rout*unclipped_signal < -V(vdd,gnd)):
clipped_signal = V(vdd,gnd);
default
clipped_signal = (V(vdd,gnd)-Rout*unclipped_signal)/2;
endcase
V(out,gnd) <+ clipped_signal;
/* I(vdd,gnd) could be added to model extra internal dissipation */
end
endmodule
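The limiting-amplifier case statement of the receiver model maps the unclipped
signal onto an inverted output voltage that is limited to the supply rails. As
a minimal sketch of that piecewise map, the C fragment below restates it with
plain doubles. The function name limit_output and its argument names are
hypothetical and only serve this illustration; they do not appear in the model.

```c
#include <assert.h>

/* Hypothetical helper mirroring the receiver's limiting-amplifier case
 * statement: vdd stands for V(vdd,gnd), rout for Rout and unclipped for
 * unclipped_signal. */
static double limit_output(double vdd, double rout, double unclipped) {
    double swing = rout * unclipped;
    assert(vdd > 0.0);
    if (swing > vdd)
        return 0.0;              /* saturated at the low rail */
    if (swing < -vdd)
        return vdd;              /* saturated at the high rail */
    return (vdd - swing) / 2.0;  /* linear, inverting region */
}
```

The three branches meet continuously: at swing = vdd the linear branch gives 0,
and at swing = -vdd it gives vdd.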
B.2 VCSEL
This section lists Verilog-AMS code fragments for VCSEL simulation.
VCSEL modeling for behavioral simulation is discussed in section 3.3.3.
B.2.1 Multimode linear laser model
The following code is a straight implementation of the multimode VCSEL
model described by Mena et al. [1999]. Here, a spatial distribution of carriers
is implemented using a three-term series expansion; two modes have been
considered.
`include "constants.vams"
`include "disciplines.vams"
/* NATURES & DISCIPLINES */
discipline power
potential Power;
enddiscipline
discipline temp
potential Temperature;
enddiscipline
nature Count
units = "#";
access = Cnt;
abstol = 1E-6;
endnature
nature Differ
units = "";
access = Diff;
abstol = 1E3;
endnature
discipline count
potential Count;
enddiscipline
nature Gain
units = "";
access = Gn;
abstol = 1E-3;
endnature
discipline gain
potential Gain;
enddiscipline
nature Fitter
units = "";
access = Fit;
abstol = 1E-6;
endnature
discipline fitter
potential Fitter;
flow Differ;
enddiscipline
module vcsel_2_modes(anode, cathode, out);
/* TERMINALS */
/* electrical terminals */
inout anode, cathode;
electrical anode, cathode;
/* optical terminals */
output [0:1] out;
power [0:1] out;
/* PARAMETERS */
/* gain-constant parameters */
parameter real G_0=0, a_g0=0, a_g1=0, a_g2=0, b_g0=0, b_g1=0, b_g2=0;
/* transparency-number parameters */
parameter real N_to=0, c_n0=0, c_n1=0, c_n2=0;
/* leakage parameters */
parameter real I_lo=0, a_0=0, a_1=0, a_2=0, a_3=0;
/* overlap integral values gamma_ki, lambda_ki */
parameter real gamma_00=0, gamma_01=0, gamma_02=0;
parameter real gamma_10=0, gamma_11=0, gamma_12=0;
parameter real lambda_00=0, lambda_01=0, lambda_02=0;
parameter real lambda_10=0, lambda_11=0, lambda_12=0;
/* overlap integral values phi_jki */
parameter real phi_100=0, phi_101=0, phi_102=0;
parameter real phi_110=0, phi_111=0, phi_112=0;
parameter real phi_200=0, phi_201=0, phi_202=0;
parameter real phi_210=0, phi_211=0, phi_212=0;
/* electrical parameters */
parameter real V_T=0, I_s=0, R_s=0, R_c=0, C_ox=0;
/* convergence parameters:
* P_k = (v_mk+delta_m)^2
* N_0 = z_n*(v_n0+delta_n)^2
* N_j = z_n*v_nj
*/
parameter real delta_m=0, delta_n=0, z_m=0, z_n=0;
parameter real n_to_m=z_n/z_m;
/* other parameters */
parameter real h_1=0, h_2=0, zeta_1=0, zeta_2=0, eta_i=0;
parameter real tau_n=0, tau_p0=0, tau_p1=0;
parameter real k_f0=0, k_f1=0, beta_0=0, beta_1=0;
parameter real b_0=0, b_1=0, b_2=0;
parameter real epsilon_00=0, epsilon_01=0, epsilon_10=0, epsilon_11=0;
parameter real R_th=0, tau_th=0;
/* INTERNAL NODES */
current I_l,I_o;
voltage V_o;
temp Ti;
power Pi;
power P_0,P_1;
count N_0z, N_1z, N_2z, N_tz;
count S_0z, S_1z;
gain G_ifo_T;
fitter v_n0, v_n1, v_n2, v_m0, v_m1;
/* DIFFERENTIAL EQUATIONS */
analog begin
/* optical behavior */
Cnt(N_0z) <+ pow(Fit(v_n0)+delta_n,2);
Cnt(N_1z) <+ Fit(v_n1);
Cnt(N_2z) <+ Fit(v_n2);
Pwr(P_0) <+ pow(Fit(v_m0)+delta_m,2);
Pwr(out[0]) <+ Pwr(P_0);
Pwr(P_1) <+ pow(Fit(v_m1)+delta_m,2);
Pwr(out[1]) <+ Pwr(P_1);
Cnt(S_0z) <+ Pwr(P_0)/(k_f0*z_m);
Cnt(S_1z) <+ Pwr(P_1)/(k_f1*z_m);
Gn(G_ifo_T) <+ G_0 * ((a_g0+Temp(Ti)*(a_g1+Temp(Ti)*a_g2))
/(b_g0+Temp(Ti)*(b_g1+Temp(Ti)*b_g2)));
Cnt(N_tz) <+ N_to*(c_n0+Temp(Ti)*(c_n1+Temp(Ti)*c_n2))/z_n;
if ((Cnt(N_0z)<1E-1/z_n) || (Temp(Ti)<0.01))
I(I_l) <+ 0;
else
I(I_l) <+ I_lo*exp((-a_0 + z_n*Cnt(N_0z)*(a_1+a_2*Temp(Ti))
- a_3/(z_n*Cnt(N_0z)) )/(Temp(Ti)));
/* The following code is admittedly ugly *
* yet we had no choice but to fully expand matrix products */
Diff(v_n0) <+ - ddt(Cnt(N_0z))
+ (eta_i*I(I_o)-I(I_l))/(z_n*`P_Q) - Cnt(N_0z)/tau_n
- Gn(G_ifo_T) * (((gamma_00*(Cnt(N_0z)-Cnt(N_tz))
- gamma_01*Cnt(N_1z) - gamma_02*Cnt(N_2z))*Cnt(S_0z)
/ (1/z_m + epsilon_00*Cnt(S_0z) + epsilon_10*Cnt(S_1z)))
+ ((gamma_10*(Cnt(N_0z)-Cnt(N_tz)) - gamma_11*Cnt(N_1z)
- gamma_12*Cnt(N_2z))*Cnt(S_1z)/(1/z_m + epsilon_01*Cnt(S_0z)
+ epsilon_11*Cnt(S_1z))));
Diff(v_n1) <+ - ddt(Cnt(N_1z))
- eta_i*zeta_1*I(I_o)/(z_n*`P_Q) - (1+h_1)*Cnt(N_1z)/tau_n
+ Gn(G_ifo_T) * (((phi_100*(Cnt(N_0z)-Cnt(N_tz))
- phi_101*Cnt(N_1z) - phi_102*Cnt(N_2z))*Cnt(S_0z)
/ (1/z_m + epsilon_00*Cnt(S_0z) + epsilon_10*Cnt(S_1z)))
+ ((phi_110*(Cnt(N_0z)-Cnt(N_tz)) - phi_111*Cnt(N_1z)
- phi_112*Cnt(N_2z))*Cnt(S_1z)/(1/z_m + epsilon_01*Cnt(S_0z)
+ epsilon_11*Cnt(S_1z))));
Diff(v_n2) <+ - ddt(Cnt(N_2z))
- eta_i*zeta_2*I(I_o)/(z_n*`P_Q) - (1+h_2)*Cnt(N_2z)/tau_n
+ Gn(G_ifo_T) * (((phi_200*(Cnt(N_0z)-Cnt(N_tz))
- phi_201*Cnt(N_1z) - phi_202*Cnt(N_2z))*Cnt(S_0z)
/ (1/z_m + epsilon_00*Cnt(S_0z) + epsilon_10*Cnt(S_1z)))
+ ((phi_210*(Cnt(N_0z)-Cnt(N_tz)) - phi_211*Cnt(N_1z)
- phi_212*Cnt(N_2z))* Cnt(S_1z)/(1/z_m + epsilon_01*Cnt(S_0z)
+ epsilon_11*Cnt(S_1z))));
Diff(v_m0) <+ - ddt(Cnt(S_0z))
- Cnt(S_0z)/tau_p0 + (b_0*Cnt(N_0z) - b_1*Cnt(N_1z)
- b_2*Cnt(N_2z))*n_to_m*beta_0/tau_n + n_to_m*Gn(G_ifo_T)
* ((lambda_00*(Cnt(N_0z)-Cnt(N_tz)) - lambda_01*Cnt(N_1z)
- lambda_02*Cnt(N_2z))*Cnt(S_0z)/(1/z_m + epsilon_00*Cnt(S_0z)
+ epsilon_10*Cnt(S_1z)) );
Diff(v_m1) <+ - ddt(Cnt(S_1z))
- Cnt(S_1z)/tau_p1 + (b_0*Cnt(N_0z) - b_1*Cnt(N_1z)
- b_2*Cnt(N_2z))*n_to_m*beta_1/tau_n + n_to_m * Gn(G_ifo_T)
* ((lambda_10*(Cnt(N_0z)-Cnt(N_tz)) - lambda_11*Cnt(N_1z)
- lambda_12*Cnt(N_2z))*Cnt(S_1z)/(1/z_m + epsilon_01*Cnt(S_0z)
+ epsilon_11*Cnt(S_1z)) );
/* thermal behavior */
Pwr(Pi) <+ V(V_o)*I(I_o)+R_c*pow(I(anode,cathode),2)
-Pwr(P_0)-Pwr(P_1);
Temp(Ti) <+ $temperature+R_th*laplace_zp(Pwr(Pi),{},{-1/tau_th,0});
/* electrical behavior */
V(V_o) <+ V_T*ln((I(I_o)/I_s)+1)+I(I_o)*R_s;
I(I_o) <+ I(anode,cathode) - C_ox*ddt(V(V_o));
V(anode,cathode) <+ V(V_o) + R_c*I(anode,cathode);
end
endmodule
B.2.2 Lumped nonlinear laser model
This code describes a lumped laser model using a nonlinear yet more accurate
gain description and including self-heating. It models a distributed Bragg re-
flector (DBR) microlaser rather than a VCSEL, yet the behavior, equations and
implementation challenges are very similar. We include the implementation
code here as it is specially crafted for convergence in the steady-state case, the
transient case as well as the periodic steady-state case. This model is fully
described in [O’Connor et al., 2006c].
`include "constants.vams"
`include "disciplines.vams"
discipline power
potential Power;
enddiscipline
discipline charge
potential Charge;
enddiscipline
discipline temp
potential Temperature;
enddiscipline
module DBR_usource(light, anode, cathode);
output light;
power light;
inout anode, cathode;
electrical anode, cathode;
/**** QUANTITIES ****/
power INT_LIGHT;
charge INT_C;
temp INT_T;
voltage INT_Va;
voltage INT_Vp;
/**** INPUT PARAMETERS ****/
/* DIMENSIONS */
/* active region depth, length and width (m) */
parameter real d=20e-9, L=18e-6, W=4e-6;
/* ELECTRO-OPTICAL INTERACTION MODEL */
/* laser mode frequency (Hz) */
parameter real nu=200E12;
/* photon group velocity (m/s) */
parameter real v_g=88E6;
/* surface recombination velocity (m/s) */
parameter real v_s=500;
/* Shockley-Read-Hall recombination coefficient (1/s) */
parameter real A=100E6;
/* spontaneous radiative recombination coefficient (m^3/s) */
parameter real B=0.2E-15;
/* Auger recombination coefficient (m^6/s) */
parameter real C0=1.76E-39;
/* activation energy for Auger recombination (J) */
parameter real Ea=9.61E-21;
/* spontaneous emission factor */
parameter real beta=0.01;
/* confinement factor */
parameter real Gamma=0.05;
/* internal optical absorption per unit length (1/m) */
parameter real alpha_i=3.5E3;
/* injection efficiency */
parameter real eta_i=0.7;
/* additional coupling efficiency */
parameter real eta_c=0.79;
/* thermal slope of characteristic carrier density (1/(m^3*K)) */
parameter real n_Ts=7.5E21;
/* intercept of characteristic carrier density at 0 K (1/m^3) */
parameter real n_T0=-899E21;
/* thermal slope of characteristic material gain (1/(m*K)) */
parameter real G_Ts=-37.15;
/* intercept of characteristic material gain at 0K (1/m) */
parameter real G_T0=195.7E3;
/* DBR mirror reflectivity */
parameter real R_DBR=0.95;
/* reflectivity modulation depth */
parameter real r_wg=0.031;
/* coupling length (m) */
parameter real Lc=2.63e-6;
/* length offset (m) */
parameter real phi_L=2.40e-6;
/* ELECTRICAL MODEL */
/* diode saturation carrier density (1/m^3)*/
parameter real ne_diode=1e12;
/* diode emission coefficient */
parameter real N_diode=2;
/* active region unit resistance (Ohm*m) */
parameter real Ra_u=45E-3;
/* active region unit capacitance (F/m) */
parameter real Ca_u=180E-12;
/* contact series unit resistance (Ohm*m^2) */
parameter real Rs_u=4.5e-9;
/* pad parasitics unit resistance (Ohm*m) */
parameter real Rp_u=1.4;
/* pad parasitics unit capacitance (F/m) */
parameter real Cp_u=130E-12;
/* THERMAL MODEL */
/* temperature of the outside of the device (K) */
parameter real T_ambient=300;
/* operating point temperature (K) */
parameter real T_op=0;
/* thermal unit resistance (K*m/W) */
parameter real R_therm_u=3.6;
/* thermal time constant (s) */
parameter real tau_therm=1E-6;
/**** CALCULATED PARAMETERS ****/
/* DIMENSIONS */
/* active volume (m^3) */
parameter real Vol=d*L*W;
/* active surface */
parameter real S=2*(L+W)*d;
/* ELECTRO-OPTICAL INTERACTION MODEL */
/* effective reflectivity */
parameter real R_eff=R_DBR
*(1-r_wg*cos(`M_PI_2*(L-phi_L)/Lc)*cos(`M_PI_2*(L-phi_L)/Lc));
/* threshold material gain (1/m) */
parameter real Gth=(alpha_i+ln(1/R_eff)/L)/Gamma;
/* extraction efficiency */
parameter real eta_extr=eta_c*ln(1/(1-r_wg*(cos(`M_PI_2*(L-phi_L)/Lc)
*cos(`M_PI_2*(L-phi_L)/Lc))))/(Gamma*L*Gth);
/* optical absorption in mirrors per unit length (1/m) */
parameter real alpha_m=ln(1/R_eff/R_eff)/(2*L);
/* total optical absorption per unit length (1/m) */
parameter real alpha=alpha_i+alpha_m;
/* photon recombination rate (1/s) */
parameter real Rph=alpha*v_g;
/* mode coupling factor (W*m^3) */
parameter real k=eta_extr*`P_H*nu*Vol/Gamma*Rph;
/* ELECTRICAL MODEL */
/* active region resistance (Ohm) */
parameter real Ra=Ra_u*d/(L*W);
/* active region capacitance (F) */
parameter real Ca=Ca_u*L*W/d;
/* contact series resistance (Ohm) */
parameter real Rs=Rs_u/(L*W);
/* pad parasitic resistance (Ohm) */
parameter real Rp=Rp_u*d/(L*W);
/* pad parasitic capacitance (F) */
parameter real Cp=Cp_u*L*W/d;
/* THERMAL MODEL */
/* thermal resistance (K/W) */
parameter real R_therm=R_therm_u*d/(L*W);
/* INTERMEDIATE CALCULATIONS */
parameter real Ce_diode=ne_diode*`P_Q*Vol;
parameter real c1=A+v_s*S/Vol;
parameter real c2=B/(`P_Q*Vol);
parameter real c30=C0/(`P_Q*Vol)/(`P_Q*Vol);
parameter real c_LIGHT=(`P_Q*Vol);
parameter real l_LIGHT=k;
parameter real l_C=B*beta*Gamma*k/(`P_Q*Vol)/(`P_Q*Vol);
/**** VARIABLES ****/
real TMP_LIGHT, TMP_C, TMP_T, TMP_Va, TMP_Vp, TMP_I;
real RESULT_LIGHT, RESULT_C, RESULT_T, RESULT_Va, RESULT_Vp, RESULT_V;
real V_diode, I_diode, Ia, Ip, dissipation, n0, G0, c3, G, Cn0;
real R_thermZ, T_ambientZ;
/**** EQUATIONS ****/
analog begin
$bound_step(20p);
TMP_I = I(anode,cathode);
TMP_LIGHT = Pwr(INT_LIGHT);
TMP_C = Q(INT_C);
if (TMP_LIGHT<0)
TMP_LIGHT=0;
if (TMP_C<0)
TMP_C=0;
if (analysis("static") && T_op>=200) begin
TMP_T = T_op;
R_thermZ = 0;
T_ambientZ = T_op;
end
else begin
R_thermZ = R_therm;
T_ambientZ = T_ambient;
if (Temp(INT_T)<200)
TMP_T = 200;
else
TMP_T = Temp(INT_T);
end
V_diode = N_diode*$vt(TMP_T)*ln(TMP_C/Ce_diode+1);
if (analysis("static")) begin
TMP_Va = V_diode+TMP_I*Ra;
TMP_Vp = TMP_Va+TMP_I*Rs;
I_diode = TMP_I;
Ia = TMP_I;
Ip = 0;
RESULT_Va = TMP_Va;
RESULT_Vp = TMP_Vp;
end
else begin
TMP_Va = V(INT_Va);
TMP_Vp = V(INT_Vp);
I_diode = (TMP_Va-V_diode)/Ra;
Ia = (TMP_I*Rp+TMP_Vp-TMP_Va)/(Rp+Rs);
Ip = (TMP_I*Rs+TMP_Va-TMP_Vp)/(Rp+Rs);
RESULT_Va = idt((Ia-I_diode)/Ca,TMP_Va);
RESULT_Vp = idt(Ip/Cp,TMP_Vp);
end
dissipation = Rs*Ia*Ia+(V_diode+Ra*I_diode)*I_diode-TMP_LIGHT;
if (analysis("static"))
if (T_op<200)
RESULT_T = T_ambient+dissipation*R_therm;
else
RESULT_T = T_op;
else
RESULT_T = idt((T_ambientZ+dissipation*R_thermZ-TMP_T)
/tau_therm,TMP_T);
c3 = c30*limexp(-Ea/(`P_K*TMP_T));
/* QW characteristic carrier density (1/m^3) */
n0 = n_Ts*TMP_T + n_T0;
/* QW characteristic material gain (1/m) */
G0 = G_Ts*TMP_T + G_T0;
/* characteristic carrier charge */
Cn0 = n0*`P_Q*Vol;
if (TMP_C<Cn0)
G = v_g*G0*(abs(TMP_C/(n0*`P_Q*Vol))-1)/k;
else
G = v_g*G0*ln(abs(TMP_C/(n0*`P_Q*Vol)))/k;
if (analysis("static"))
RESULT_LIGHT = l_C*TMP_C*TMP_C/(Rph-Gamma*l_LIGHT*G);
else
RESULT_LIGHT = idt(Gamma*l_LIGHT*G*TMP_LIGHT-Rph*TMP_LIGHT
+l_C*TMP_C*TMP_C,TMP_LIGHT);
RESULT_C = idt(eta_i*I_diode-((c3*TMP_C+c2)*TMP_C+c1)*TMP_C
-c_LIGHT*G*TMP_LIGHT);
RESULT_V = TMP_Va+Ia*Rs;
if (analysis("static")) begin
if (RESULT_C<0 || RESULT_LIGHT<0) begin
RESULT_C=2*Cn0;
RESULT_LIGHT=1e-6;
end
end
Pwr(INT_LIGHT) <+ abs(RESULT_LIGHT);
Q(INT_C) <+ abs(RESULT_C);
Temp(INT_T) <+ RESULT_T;
V(INT_Va) <+ RESULT_Va;
V(INT_Vp) <+ RESULT_Vp;
Pwr(light) <+ RESULT_LIGHT;
V(anode,cathode) <+ RESULT_V;
if (!analysis("static"))
@(final_step) begin
$strobe("Final photon density = %fE21/m^3", RESULT_LIGHT
/(1E21*k));
$strobe("Final carrier density = %fE21/m^3", RESULT_C
/(1E21*`P_Q*Vol));
$strobe("Final active region temperature = %f K", RESULT_T);
$strobe("Final active region voltage = %f V", RESULT_Va);
$strobe("Final parasitics voltage = %f V", RESULT_Vp);
end
end
endmodule
Appendix C
Ray tracing code
This is C language code of the simple ray tracing model discussed in section
4.3.3.
#include <math.h>
/* Random number generation is performed by the MT19937 algorithm.
* See [Matsumoto and Nishimura, 1998]
* http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
*/
#include "sfmt19937.h"
/* These are the Laguerre polynomials L_0 through L_6 */
static double LaguerreL(unsigned int n, double x) {
switch (n) {
case 0:
return 1;
break;
case 1:
return -x+1;
break;
case 2:
return ((x-4)*x+2)/2;
break;
case 3:
return (((-x+9)*x-18)*x+6)/6;
break;
case 4:
return ((((x-16)*x+72)*x-96)*x+24)/24;
break;
case 5:
return (((((-x+25)*x-200)*x+600)*x-600)*x+120)/120;
break;
case 6:
return ((((((x-36)*x+450)*x-2400)*x+5400)*x-4320)*x+720)/720;
break;
default:
return 0; /* higher orders unsupported */
break;
}
}
/* This function calculates the coupling efficiency between a
* Laguerre-Gaussian beam LG(n,m) and a graded-index fiber. The minimal
* beam waist (at a perpendicular distance z from the fiber butt) is w0 and
* the wave number is k. The beam and fiber axes are assumed to be parallel,
* but misaligned by a distance r. The fiber core radius is rcore, with a
* central core refractive index ncore. The numerical aperture at the center
* of the fiber is NA. The parameter "resolution" is the required absolute
* accuracy of the result.
* The parameter "filterlevel" determines which rays to accept:
* 1: accept all rays entering the core
* 2: same as 1, but reject refracting rays
* 3: same as 2, but additionally reject tunneling rays
*
* The refraction of rays at the core entry is included, but the associated
* reflection loss is not taken into account (this can be catered for by a
* global scale factor).
*/
double LaguerreGaussianGrinCoupling(double w0, double k,
unsigned int n, unsigned int m,
double rcore, double ncore, double NA,
double r, double z,
double resolution, unsigned int filterlevel) {
unsigned long tests=0;
double Nt=0, St=0;
double sigma_r=0.5*w0;
double sigma_t=1/(k*w0);
double f0_xy=2.0/(w0*w0);
double f0_st=0.5*(k*k)*(w0*w0);
double f1_ct=2*k;
double rcoreSQ=rcore*rcore;
double n0SQg=(ncore*rcore/NA)*(ncore*rcore/NA);
double directionalscaling=1/ncore;
double resfac=4.0/(resolution*resolution);
unsigned long resmod=((unsigned long)(10.0/resolution));
double fac,rsq,v1,v2;
double rx,ry,tx,ty;
double f0, f1, weight;
double rxSQ, rySQ, txSQ, tySQ;
double tnormal;
double AxSQ_tnormal, AySQ_tnormal, AxSQ_AySQ;
double sin_phi_yx_over_k_Ax_Ay;
double p;
/* initialize the random number generator */
init_gen_rand(4357); /* we have fixed the seed for determinism */
while (1) {
/* Here we generate rays with initial position (rx,ry)
* and initial direction (tx,ty) (these are tangents).
*
* We apply the Wigner-transformed representation of Laguerre-Gaussian
* bundles to generate rays. For more information, we refer to
* Gase [1995]; Simon and Agarwal [2000]; Bastiaans and Alieva [2006].
*
* In practice, we always generate rays which result in a Gaussian beam
* profile if their weights are equal. By attributing unequal weights to
* the rays, any Laguerre-Gaussian profile can be generated.
*/
/* generate tx and ty using Box-Muller [1958] */
do {
v1=2.0*genrand_real3()-1.0; /* genrand_real3 is random in (0,1) */
v2=2.0*genrand_real3()-1.0;
rsq=v1*v1+v2*v2;
} while (rsq >= 1.0 || rsq == 0.0);
fac=sqrt(-2.0*log(rsq)/rsq);
tx=sigma_t*v1*fac;
ty=sigma_t*v2*fac;
/* generate rx and ry using Box-Muller */
do {
v1=2.0*genrand_real3()-1.0;
v2=2.0*genrand_real3()-1.0;
rsq=v1*v1+v2*v2;
} while (rsq >= 1.0 || rsq == 0.0);
fac=sqrt(-2.0*log(rsq)/rsq);
rx=sigma_r*v1*fac;
ry=sigma_r*v2*fac;
/* calculate weight of ray */
if (n==0 && m==0)
weight=1.0;
else {
f0=f0_xy*(rx*rx+ry*ry)+f0_st*(tx*tx+ty*ty);
f1=f1_ct*(rx*ty-ry*tx);
weight=LaguerreL(n,f0+f1)*LaguerreL(m,f0-f1);
}
/* incorporate height and displacement */
rx=rx+z*tx-r;
ry=ry+z*ty;
/* increment number of tests */
tests++;
Nt+=weight;
/* test core containment */
rxSQ = rx*rx;
rySQ = ry*ry;
if (rxSQ+rySQ<=rcoreSQ) {
if (filterlevel==1)
St+=weight; /* accept all rays entering the core */
else {
/* core entry occurs; apply Snellius */
tx=directionalscaling*tx;
txSQ = tx*tx;
ty=directionalscaling*ty;
tySQ = ty*ty;
/* check angular acceptance in fiber */
tnormal = txSQ+tySQ+1;
AxSQ_tnormal = ((n0SQg-rySQ)*txSQ+rxSQ*(1+tySQ));
AySQ_tnormal = ((n0SQg-rxSQ)*tySQ+rySQ*(1+txSQ));
AxSQ_AySQ = (AxSQ_tnormal+AySQ_tnormal)/tnormal;
switch (filterlevel) {
/* for the mathematics supporting classification of rays, we refer to
Ankiewicz and Pask [1977] */
case 2:
/* discard only refracting rays */
sin_phi_yx_over_k_Ax_Ay=ry*tx-rx*ty;
if (rcoreSQ*rcoreSQ-(AxSQ_AySQ)*rcoreSQ
+(n0SQg-AxSQ_AySQ)*(sin_phi_yx_over_k_Ax_Ay
*sin_phi_yx_over_k_Ax_Ay) >= 0)
St+=weight;
break;
case 3:
/* discard refracting and tunneling rays */
if (AxSQ_AySQ<=rcoreSQ)
St+=weight;
break;
default:
break;
}
}
}
/* stopping criterion */
if (tests%resmod==0) {
p=St/Nt;
if (((double)tests)-resfac*p*(1.0-p) >= 0)
return p;
}
}
}
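The stopping criterion at the end of the listing can be read as a confidence
test on the estimated coupling efficiency. The C sketch below restates it in
isolation, assuming unit ray weights (the Gaussian case n = m = 0), for which
p = St/Nt is a binomial proportion. The function names should_stop and
worst_case_rays are hypothetical and are not part of the original code.

```c
#include <assert.h>

/* Restatement of the listing's stopping test. For unit ray weights the
 * standard deviation of p is sqrt(p*(1-p)/tests), so the condition below
 * is equivalent to 2*sqrt(p*(1-p)/tests) <= resolution. */
static int should_stop(unsigned long tests, double p, double resolution) {
    double resfac;
    assert(resolution > 0.0);
    resfac = 4.0 / (resolution * resolution);
    return (double)tests - resfac * p * (1.0 - p) >= 0.0;
}

/* Worst case p = 0.5: a ray count that satisfies the test for any p. */
static unsigned long worst_case_rays(double resolution) {
    assert(resolution > 0.0);
    return (unsigned long)(1.0 / (resolution * resolution)) + 1UL;
}
```

For a resolution of 0.01 this bounds the run at roughly 10^4 rays; the listing
only evaluates the test every resmod rays, so the actual count is rounded up to
a multiple of that interval.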
Appendix D
Calculation details
D.1 Derivation of a stochastic model for array-wide
fiber-coupled power
This section complements the discussion of the array-wide statistical modeling
of VCSEL-fiber coupled power of section 4.3.5.
We will now explore the joint distribution of $\mathrm{OMA}(i_0, i_1)$ and $\mathrm{AVG}(i_0, i_1)$
by evaluating the mean and/or (co)variance of expressions containing these
variables. In some expressions appearing below (e.g., in equation D.1), this is
done in two stages.
A first step is to express all (co)variances in terms of expected values, e.g.,
$\mathrm{var}[\cdot] = E[\cdot^2] - E[\cdot]^2$. Unconditional expectations are then evaluated
through an intermediate conditioning on $A$, as follows: $E[\cdot] = E[E[\cdot \mid A]]$.
In this stage, random variables can only appear within an expression of the
form $E[E[\cdot \mid A]]$. The innermost conditional expectation results in a random
variable which is a function of the alignment $A$. It can be directly calculated
from the distribution parameters of equation 4.8. Note that the components
of $\mathrm{OMA}(i_0, i_1)$ and $\mathrm{AVG}(i_0, i_1)$ are alignment-conditionally independent for
different array positions.
The unconditional outermost expected value is estimated using a Monte Carlo
approach: $n_{MC}$ (typically $10^4$ to $10^5$) subsequent joint mechanical alignments
are generated by collectively sampling the distributions shown in figure 4.14
on page 101. In the k-th Monte Carlo sample, the mechanical alignment is
translated into the vectors $r_{1,k}, \ldots, r_{n_d,k}$, yielding numerical values for the
misalignment of each fiber-laser pair, and the conditional expectation $E[\cdot \mid A]$
is calculated given these vectors. The averaged outcome of this calculation
over all $n_{MC}$ samples then yields the required estimate.
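The two-stage evaluation can be illustrated with a small C sketch. It assumes
a toy setting that is not taken from this work: a scalar alignment a drawn
uniformly from [0,1), a hypothetical conditional mean cond_mean(a) = 1 - a^2
and a hypothetical constant conditional variance cond_var(a) = 0.01. The
generator uniform01 is a plain 64-bit linear congruential generator chosen
only to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Small deterministic generator (64-bit LCG), uniform in [0,1). */
static double uniform01(uint64_t *state) {
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(*state >> 11) * (1.0 / 9007199254740992.0);
}

/* Hypothetical conditional moments given a scalar alignment a. */
static double cond_mean(double a) { return 1.0 - a * a; }
static double cond_var(double a)  { (void)a; return 0.01; }

/* Estimates E[X] and var[X] by averaging conditional moments over n_mc
 * alignment samples, using E[X] = E[E[X|A]] and
 * E[X^2] = E[var[X|A] + E[X|A]^2]. */
static void mc_moments(unsigned long n_mc, double *mean, double *var) {
    uint64_t state = 4357;
    double sum1 = 0.0, sum2 = 0.0;
    unsigned long k;
    assert(n_mc > 0);
    for (k = 0; k < n_mc; k++) {
        double a = uniform01(&state);   /* k-th alignment sample */
        double m = cond_mean(a);        /* inner expectation, closed form */
        sum1 += m;
        sum2 += cond_var(a) + m * m;
    }
    *mean = sum1 / (double)n_mc;
    *var = sum2 / (double)n_mc - (*mean) * (*mean);
}
```

For this toy model the exact values are E[X] = 2/3 and var[X] = 0.01 + 4/45,
which the estimate approaches as n_mc grows. Only the outer expectation is
sampled; the inner one is evaluated in closed form, as in the procedure above.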
Now we will look into the behavior of the individual components of
$\mathrm{OMA}(i_0, i_1)$ and $\mathrm{AVG}(i_0, i_1)$ and the occurrence of statistical dependences
across different array positions. The expected intra-VCSEL variance at a given
array position $d$, ignoring misalignment uncertainty, is given by
$$E\left[\mathrm{var}\left[\mathrm{OMA}_d(i_0, i_1) \mid A\right]\right], \tag{D.1}$$
with a similar equation for $\mathrm{AVG}_d$. It evaluates in the ranges 0.077 to 0.087 for
OMA and 0.025 to 0.35 for AVG (heavily dependent on the current setting).
The following expression is also a quadratic measure. It represents the deviation
of the coupled optical modulation power at the given position $d$ from the
array average, ignoring VCSEL process variations:
$$E\left[\left(E\left[\mathrm{OMA}_d(i_0, i_1) - \langle \mathrm{OMA}(i_0, i_1) \rangle \mid A\right]\right)^2\right], \tag{D.2}$$
where the angular brackets denote an averaging over all array positions. It
evaluates to $9.2 \cdot 10^{-4}$ to 0.0049 for OMA and $9.5 \cdot 10^{-5}$ to 0.0039 for AVG,
both heavily dependent on the considered array position.
We remark that even the highest observed value for equation D.2 (caused only
by global rotational misalignment and small fiber true position errors) is an
order of magnitude smaller than the lowest observed intra-VCSEL variance
(caused only by VCSEL variability). This means that the effect of nonuniform
misalignment is dwarfed by VCSEL process variations. Based on this obser-
vation, we can ignore any global rotational misalignment without affecting
accuracy much; the notion of global translational misalignment is preserved.
This yields a natural way of decomposing OMA (i0, i1) and AVG (i0, i1) as a
sum of a global alignment-related contribution and component-specific devia-
tions (caused by VCSEL variability and thus considered i.i.d. across different
array positions). A discussion on the distributions of the components of this
decomposition is in the main text, from equation 4.9 (page 100) onward.
D.2 Calculation of the yield of an array-wide low
BER when using a common logic threshold
In this section, the expectation over the distribution of $A$ and $\mathrm{AVG}_0$ in
equation 4.20 (page 114) is evaluated using a Monte Carlo method. In the
k-th iteration, a sample $a_k$ of $A$ is drawn first, fixing the array-wide alignment-
related components in equations 4.9 and 4.10 (page 100). Given $a_k$, a sample
$\mathrm{avg}_{0,k}$ of the distribution of $\mathrm{AVG}_0$ is drawn next, representing the array-wide
derived logic threshold.
Given $a_k$ and $\mathrm{avg}_{0,k}$, we can now evaluate
$$P\left[\text{BER of 1 channel} < 10^{-12} \mid A = a_k, \mathrm{AVG}_0 = \mathrm{avg}_{0,k}\right]. \tag{D.3}$$
Again we apply a Monte Carlo method by generating and examining a very
large number ($\gg n_d$) of data channels. In the d-th iteration of the k-th top-
level iteration, a joint sample $(\mathrm{oma}_{d,k}, \mathrm{avg}_{d,k})$ of the conditional distribution
of $(\mathrm{OMA}_d, \mathrm{AVG}_d)$ on $A = a_k$ is drawn. Equation 4.14 (page 109) is now used
to derive a value for the amplitude noise $\sigma_{N,d,k}$ given $\mathrm{oma}_{d,k}$. The threshold
bias $\beta^{[mW]}_{d,k}$ is calculated as $\mathrm{avg}^{[mW]}_{0,k} - \mathrm{avg}^{[mW]}_{d,k}$. Substituting $\mathrm{oma}_{d,k}$, $\sigma_{N,d,k}$ and
$\beta_{d,k}$ into equation 4.15 (page 109) yields the estimated BER. After sufficient
iterations of the innermost Monte Carlo method, the ratio of the channels
where BER $< 10^{-12}$ to the total number of channels considered, raised to the
power $n_d - 1$, provides a value for the k-th top-level Monte Carlo iteration.
Averaging all such values then yields the desired yield value.
D.3 Improved pulse position estimate
When the variance $\sigma^2$ of the additive noise is not negligible, equation 5.22
yields an asymptotically biased estimate of the pulse position. Here, we
describe a procedure to remove this bias. Firstly, $\sigma^2$ is estimated. To this
end, the density $f_n(x)$ is estimated by performing a vertical scan at a time
instant where the signal is constant over a sufficiently wide time interval;
$\sigma^2$ can be calculated from this estimate. Instead of making a cut at a single
voltage $v_0$, we now make two cuts at nearby voltages $v_1$ and $v_2$ so that
$t'_i(v_1) \approx t'_i(v_2) \approx (t_i(v_2) - t_i(v_1)) / (v_2 - v_1)$. Equations 5.26 and 5.29 at $v_1$
and $v_2$ then lead to a system with four equations and four unknowns $t_i(v_j)$,
assuming that $\sigma^2$ is known. After solving these equations, all the parameters
in equation 5.22 are known, and hence more accurate values for the pulse
position can be obtained. This gives the following result:
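$$\frac{t_1(v_1) + t_2(v_1)}{2} \approx \frac{\hat{M}_1(v_1) + C}{\hat{M}_0(v_1)} \tag{D.4}$$
$$\frac{t_1(v_2) + t_2(v_2)}{2} \approx \frac{\hat{M}_1(v_2) + C}{\hat{M}_0(v_2)}, \tag{D.5}$$
with
$$C = \frac{\sigma^2 \, \Delta\hat{M}_0 \left[\hat{M}_1(v_1)\,\hat{M}_0(v_2) - \hat{M}_1(v_2)\,\hat{M}_0(v_1)\right]}{\hat{M}_0(v_1)\,\hat{M}_0(v_2)\,(\Delta v)^2 - \sigma^2 (\Delta\hat{M}_0)^2}, \tag{D.6}$$
and where $\Delta$ denotes the difference of a quantity between $v_2$ and $v_1$.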
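As a hedged illustration, the correction can be evaluated with a few lines of
C. The sketch assumes that equations D.4 to D.6 are read as fractions with
numerators M1 + C and the bracketed product, respectively, and that the moment
estimates at both cuts are already available. The type cut and the functions
correction and midpoint are hypothetical names introduced for this example.

```c
#include <assert.h>

/* Cut voltage with its zeroth and first moment estimates. */
struct cut { double v, m0, m1; };

/* Correction term C of equation D.6; sigma2 is the noise variance. */
static double correction(struct cut c1, struct cut c2, double sigma2) {
    double dv = c2.v - c1.v;        /* Delta v */
    double dm0 = c2.m0 - c1.m0;     /* Delta M0 */
    double den = c1.m0 * c2.m0 * dv * dv - sigma2 * dm0 * dm0;
    assert(den != 0.0);
    return sigma2 * dm0 * (c1.m1 * c2.m0 - c2.m1 * c1.m0) / den;
}

/* Corrected interval midpoint (t1 + t2)/2 at one cut, equations D.4/D.5. */
static double midpoint(struct cut c, double corr) {
    assert(c.m0 != 0.0);
    return (c.m1 + corr) / c.m0;
}
```

Two limiting cases follow directly from the formula: C vanishes when sigma2 is
zero, and it also vanishes when both cuts already give the same ratio M1/M0, so
the uncorrected midpoint is then returned unchanged.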
D.4 Pulse width and position estimate around the
peak
To estimate the pulse peak in the presence of both jitter and additive noise, it is
approximated by the parabola p (t) (equation 5.30). We solve this equation for
t1 (v) and t2 (v) and substitute the results into the right-hand side of equations
5.26 and 5.28. The additive-noise density fn (x) can be replaced either by the
observed empirical distribution or approximately by a Gaussian density with
the same variance. These equations concretize to:
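\begin{align}
\hat{M}_0(v_0, t) &\approx \int_{-\infty}^{\infty} f_n(x) \, 2\sqrt{\frac{v_{top} - v_0 - x}{d}} \, dx \tag{D.7} \\
\hat{M}_1(v_0, t) &\approx \int_{-\infty}^{\infty} f_n(x) \, \frac{\left(t_{top} + \sqrt{\frac{v_{top} - v_0 - x}{d}}\right)^2}{2} \, dx
 - \int_{-\infty}^{\infty} f_n(x) \, \frac{\left(t_{top} - \sqrt{\frac{v_{top} - v_0 - x}{d}}\right)^2}{2} \, dx \tag{D.8} \\
&= t_{top} \int_{-\infty}^{\infty} f_n(x) \, 2\sqrt{\frac{v_{top} - v_0 - x}{d}} \, dx \tag{D.9} \\
&\approx t_{top} \, \hat{M}_0(v_0, t) \tag{D.10}
\end{align}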
We again perform two cuts at two different heights v1 and v2 near the top,
which yields a set of four equations. This set is redundant, as both cuts should
yield the same location of the interval midpoint, because the parabola has a
vertical axis of symmetry. Three of the equations can be numerically solved
for ttop, vtop and d. An accurate estimate of the precise pulse peak follows.
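For the special case of negligible additive noise, the three unknowns have a
closed form that may serve as a starting point for the numerical solution. The
C sketch below assumes the parabola has the form p(t) = v_top - d*(t - t_top)^2,
which is consistent with the square roots in equations D.7 to D.9 but is an
assumption about equation 5.30. Under that assumption the pulse width at a cut
voltage v is M0 = 2*sqrt((v_top - v)/d) and M1 = t_top*M0. The type peak and the
function peak_from_cuts are hypothetical names.

```c
#include <assert.h>

struct peak { double t_top, v_top, d; };

/* Noise-free closed form from two cuts near the top.
 * v1, v2: cut voltages; m0_1, m0_2: widths at the cuts;
 * m1_1: first moment at the first cut. */
static struct peak peak_from_cuts(double v1, double m0_1, double m1_1,
                                  double v2, double m0_2) {
    struct peak p;
    double w1sq = m0_1 * m0_1, w2sq = m0_2 * m0_2;
    assert(m0_1 > 0.0 && w1sq != w2sq);
    p.d = 4.0 * (v2 - v1) / (w1sq - w2sq);  /* from the two widths */
    p.v_top = v1 + p.d * w1sq / 4.0;        /* peak voltage */
    p.t_top = m1_1 / m0_1;                  /* peak position, as in D.10 */
    return p;
}
```

With noise present these values are only approximate, since the integrals over
the noise density no longer reduce to the expressions used here.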
