Last Two Surface Range Detector for Direct Detection Multisurface Flash Lidar in 90nm CMOS Technology by Preston, Douglas
Wright State University 
2017 
Repository Citation 
Preston, Douglas, "Last Two Surface Range Detector for Direct Detection Multisurface Flash Lidar in 90nm 
CMOS Technology" (2017). Browse all Theses and Dissertations. 1829. 
https://corescholar.libraries.wright.edu/etd_all/1829 
LAST TWO SURFACE RANGE DETECTOR FOR DIRECT 
DETECTION MULTISURFACE FLASH LIDAR IN 90nm CMOS 
TECHNOLOGY 
 
 
A thesis submitted in partial fulfillment 
of the requirements for the degree of 
Master of Science in Electrical Engineering 
 
By 
 
Douglas Preston 
B.S.E.P., Wright State University, 2014 
 
 
 
 
 
 
2017 
WRIGHT STATE UNIVERSITY 
WRIGHT STATE UNIVERSITY 
GRADUATE SCHOOL 
 
_May 12, 2017_ 
I HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER MY 
SUPERVISION BY Douglas Preston ENTITLED “_Last Two Surface Range Detector 
for Direct Detection Multisurface Flash Lidar in 90nm CMOS Technology_” BE 
ACCEPTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE 
DEGREE OF Master of Science in Electrical Engineering 
___________________________ 
Saiyu Ren, Ph.D. 
Thesis Director 
 
___________________________ 
Arnab K. Shaw, Ph.D. 
Thesis Co-Director 
 
___________________________ 
Brian Rigling, Ph.D. 
Department Chair 
Committee on Final Examination 
 
___________________________ 
Arnab K. Shaw, Ph.D. 
 
___________________________ 
Saiyu Ren, Ph.D. 
 
___________________________ 
Raymond E. Siferd , Ph.D.  
 
___________________________ 
Robert E.W. Fyffe, Ph.D. 
Vice President for Research and 
Dean of the Graduate School 
Abstract 
 
Preston, Douglas. M.S.E.E., Department of Electrical Engineering, Wright State 
University, 2017. “Last Two Surface Range Detector for Direct Detection 
Multisurface Flash Lidar in 90nm CMOS Technology”  
 
 
 
This thesis explores a novel detection architecture for use in a Direct-Detect Flash 
LIDAR system. The proposed architecture implements detection of the last two 
surfaces within single pixels of a target scene. The novel, focal plane integrated 
detector design allows for detection of objects behind sparse and/or partially reflective 
covering such as forest canopy. The proposed detector would be duplicated and 
manufactured on-chip behind each avalanche photodiode within a focal plane array. 
Analog outputs are used to minimize interference from digital components on the 
analog input signal. The proposed architecture is a low-footprint solution which 
requires low computational post-processing. Additionally, constant fraction 
discrimination is used to mitigate range walk. 
The proposed architecture is designed in 90nm CMOS technology. The footprint is 
170.1 µm² with the largest transistor dimension being 22 µm. The design is easily 
expandable in hardware to allow additional surfaces to be detected. 
  
TABLE OF CONTENTS
1 INTRODUCTION
1.1 Background
1.2 Research Motivation
1.3 Thesis organization
2 THEORY
2.1 Analog Timing
2.2 Constant Fraction Discrimination
2.3 Shift Registers
2.4 Theory of Analysis
2.4.1 Timing Interval Measurement Uniformity
2.4.2 Range Walk
2.4.3 Multi-Pulse Timing Separation Resolution and Separation Confusion
2.4.4 Process Variation
3 LAST TWO SURFACE RANGE DETECTOR DESIGN
3.1 Introduction
3.2 Constant Fraction Discriminator
3.2.1 Differential Amplifier
3.2.2 Buffer
3.2.3 Comparator
3.2.4 Delay
3.2.5 Analog Optimization Process
3.2.6 CFD Logic and CFD Footprint
3.3 Timing Logic
3.4 Track and Hold Analog Timers
3.5 Conclusion
4 PERFORMANCE ANALYSIS
4.1 Introduction
4.2 Timing Interval Uniformity
4.3 Range Walk
4.4 Multi-Pulse Timing Separation Resolution and Separation Confusion
4.5 Uncertainty as a Function of Noise
5 CONCLUSION AND FUTURE WORK
5.1 Conclusion
5.2 Future work
Appendix A
References
LIST OF FIGURES

Figure 1 -- TDC described by [12]
Figure 2 -- Analog Timing Mechanism Component
Figure 3 -- Schematic of a simple Constant Fraction Discriminator
Figure 4 – Transient Function of a CFD
Figure 5 – Idealized Serial-Input, Parallel-Output (SIPO) Shift Register consisting of three memory cells.
Figure 6 – Schematic for a 3-bit SIPO shift register implemented using D flip-flops with reset functionality.
Figure 7 – Simulated range image of a civilian truck beneath a tree.
Figure 8 – Simulated return signal generated using the range image in Figure 7.
Figure 9 – Frequency content of the signal shown in Figure 8
Figure 10 – High Level Last Two Surface Range Detector Schematic
Figure 11 – Example Waveforms for the Last Two Surface Range Detector
Figure 12 – Final High-Level Constant Fraction Discriminator Schematic
Figure 13 – Single-Ended Output, Differential Input CMOS Amplifier; basic schematic
Figure 14 – Initial Differential Amplifier DC sweep
Figure 15 – Analog Buffer; basic schematic.
Figure 16 – Buffer Amplifier Bode Plot
Figure 17 – Buffer Bode Plot and Group Delay
Figure 18 – Asynchronous Comparator; basic schematic.
Figure 19 – Frequency response for comparator using optimized differential amplifier.
Figure 20 – Comparator transient response using optimized differential amplifier
Figure 21 – All-pass filter schematic
Figure 22 – Amplitude and Group Delay Frequency Responses for the All-Pass Filter
Figure 23 – Analog and analog-connected components in the CFD
Figure 24 – CFD Analog Components Transient Response Plot
Figure 25 – CFD Iterative Length Design
Figure 26 – Process Corner Analysis
Figure 27 – System Bandwidth
Figure 28 – CFD Logic Schematic
Figure 29 – AND-type CFD transient response for an input series of closely separated Gaussian pulses with varying separation.
Figure 30 – Timing Logic Schematic.
Figure 31 – Idealized Timing Logic
Figure 32 – Timing Logic Schematic in Feedback Mode.
Figure 33 – 3-Branch Analog Timing Mechanism
Figure 34 – Effects of Transmission Gate Capacitive Load on Charge Injection.
Figure 35 – Sample Transient Response for the Last Two Surface Range Detector
Figure 36 – Input and Output for Several Single-Pulse Simulations
Figure 37 – Timing Uniformity TT Process Corner with Input Ramp Residual
Figure 38 – Timing Uniformity TT Process Corner with Linear Fit Residual
Figure 39 – Timing Uniformity SS Process Corner with Linear Fit Residual
Figure 40 – Timing Uniformity FF Process Corner with Linear Fit Residual
Figure 41 – Two Pulse Separation Uniformity and Single-Pulse vs. Two-Pulse Residual
Figure 42 – Range Walk in Volts for Three Process Corners
Figure 43 – Range Walk in Nanoseconds for Three Process Corners
Figure 44 – Two Pulse Separation Confusion for TT Process Corner
Figure 45 – Example transient simulation for single 2.5ns FWHM Gaussian input pulse with AWGN
Figure 46 – Timestamp Uncertainty Due to Input Signal Noise
Figure 47 – Single-Ended Output, Differential Input CMOS Amplifier
Figure 48 – DFFwRST0: D-Type Flip Flop with Reset-to-Zero Functionality.
Figure 49 – RS Latch. Total Transistor Area: 0.288µm²
Figure 50 – NAND Gate
Figure 51 – NOR Gate
Figure 52 – NOT Gate
Figure 53 – Transmission Gate

LIST OF TABLES

Table 1 – Initial Differential Amplifier Design and Performance
Table 2 – Buffer Differential Amplifier Design Parameters
Table 3 – Comparator Differential Amplifier Design Parameters
Table 4 – Comparator Design Parameters
Table 5 – Analog Delay Design Parameters
Table 6 – Summary of Figure 26 Results for the Differentiating Comparator Output
Table 7 – Summary of Figure 27 Results
Table 8 – AND Gate Logic
Table 9 – Expanded AND Gate Logic
Table 10 – CFD Logic
Table 11 – Minimum Footprint Estimate for CFD
Table 12 – Minimum Footprint Estimate for Timing Logic
Table 13 – Summary of Input Ramp and Output Voltage Swing
Table 14 – Rise and Fall Times for Input Voltage Transitions between 300mV and 800mV for Analog Timing Branches
Table 15 – Minimum Footprint Estimate for the Analog Timing Mechanism
Table 16 – Minimum Footprint Estimate for the Proposed Detector
Table 17 – Summary of Timing Interval Uniformity Results
Table 18 – Summary of Range Walk Results
Table 19 – Summary of Multi-Pulse Separation Simulation Results
Table 20 – List of Schematics in Appendix A
Acknowledgement 
 
I would like to thank my thesis advisors Dr. Arnab Shaw and Dr. Saiyu Ren for 
their support. Thanks to Dr. Shaw for introducing me to this project and to the people 
of WPAFB. Thanks to Dr. Ren for her caring and careful guidance. Additional thanks 
goes to Dr. Raymond Siferd. 
I would like to thank the technical advisors at WPAFB with special thanks to 
Robert Muse and Dr. David Rabb. Additional thanks goes out to Dr. Matthew Dierking; 
may he enjoy retirement. 
I would like to thank my friends and colleagues from WSU and WPAFB. 
Finally, I thank my family for their support throughout this and future endeavors.
1 INTRODUCTION 
1.1 Background 
Flash lidar refers to a type of Time of Flight (ToF) camera. In a flash lidar system a 
pulse (or flash) of light, typically infrared, is generated at the detector and allowed to 
propagate towards and interact with the environment to be sensed. Some light from the 
original pulse is reflected off the environment back towards the detector. The reflected 
light is collected by optics onto a 2D array of photodiodes, called the focal plane. The 
delay between the initial output pulse and the pulses in the return signal for each 
photodiode is used to generate a range image of the sensed environment; ranges being 
determined based on the speed of light [1] [2]. The information from the range image 
can be used to generate a 3D model of the environment for use in various applications. 
Lidar has been used in a number of remote sensing applications. Applications are 
diverse and include precision remote sensing for unmanned vehicles in terrestrial and 
planetary environments [3], assistance in manned piloting applications [4], and detailed 
urban and forestry mapping [5]. 
Advancements have been made in scanning lidar to allow high speed sensing of the 
environment for applications in autonomous urban vehicles [4]; however, acquiring 
detailed range images using laser scanning requires sensing time on the order of 
minutes [5] and is unsuitable for real-time applications. Full-waveform lidar has been 
used in satellite-based applications in the mapping of forests as well as in bathymetry [6] 
[7] [8] [9]. Full-waveform lidar has the capability of producing high-resolution 3D 
images [6] [8]; however, full-waveform techniques generate large amounts of data 
which require significant time for post-processing. Direct-detect flash lidar systems, 
also called range cameras, are typically very fast, with sensing times of less 
than a second [2] [10] [11]. However, the speed of direct-detect flash lidar comes at the 
cost of discarding much of the information obtained using other lidar methods; systems 
of this type are typically implemented such that only the closest surface is detected 
while objects behind sparse covering such as forest canopy or other partially reflective 
surfaces are ignored by the detector. 
1.2 Research Motivation 
The goal of this research was to explore the feasibility of an architecture which 
expands on the capabilities of existing direct-detect flash lidar. Specifically, the focus 
was to develop fast, low-footprint, hardware-integrated, multi-surface detection 
capability; notably with the ability to detect the farthest two surfaces from the detector 
for each photodiode within the focal plane. Implementing the proposed circuit on the 
focal plane of a direct-detect flash lidar system would enable real-time detection of 
objects and surfaces behind sparse or partially reflective covering. 
1.3 Thesis organization 
This thesis is organized into chapters, sections, and subsections; each of the five 
chapters contains a number of sections, which in turn may contain subsections. 
The chapters are as follows: Chapter 1 provides a basic description of the 
state of the art as well as some background on this work in layman’s terms; Chapter 2 
covers the background theory needed to understand the work, including basic 
design components and analysis techniques; Chapter 3 details the design of the circuit 
proposed by this thesis; Chapter 4 describes the methodology for analyzing the 
performance of the proposed circuit and presents the results of the analyses conducted; 
Chapter 5 contains a synopsis of the completed work as well as a look at 
possible future extensions of this work. 
 
2 THEORY 
2.1 Analog Timing 
A Time to Digital Converter (TDC) is a component which transforms the timing of 
a continuous-time event into digital data. A straightforward analog TDC method 
involves representation of the timing interval by a linearly increasing voltage which is 
then converted to digital domain data by an ADC [12]. 
 
Figure 1 -- TDC described by [12] 
Start and stop signals are shown in addition to Veq, which represents the timing interval as a 
voltage. Image from [12]. 
 
For applications such as flash lidar where event signaling (start and stop in Figure 
1) is generated within a pixel on a focal plane and low footprint is desired, the ADC 
can be located off-chip. Additionally, the pulse generator and integrator shown in 
Figure 1 can be replaced with a track and hold component sampling from a ramp 
function. The ramp can also be generated off-chip and, in a multi-pixel system, input 
to all pixels for use in the timing mechanism. 
 
Figure 2 -- Analog Timing Mechanism Component 
C (complement of C’), the control signal, is normally voltage high (transmission gate conducting) 
and changes to voltage low (transmission gate open) in response to a timing event; CH is a holding 
capacitance. 
 
The timing mechanism shown takes a digital event as input (the control signal in 
Figure 2) and outputs an analog voltage timestamp of the event. Before a timing 
event, the control signal is voltage high and the input ramp signal propagates 
through the transmission gate to the output. When a timing event occurs, the control 
signal switches from voltage high to voltage low. Since the transmission gate is now 
open, the holding capacitor, CH, maintains a voltage at the output. Since the input 
signal voltage is linearly related to time (a ramp) the output voltage after a timing 
event represents the time at which the event occurred; in other words, the output 
voltage is a timestamp. 
It is possible to overwrite a timestamp by changing the control signal such that 
the transmission gate changes from open to closed; this allows the input ramp 
signal voltage to charge the capacitor. After a rise time, determined in part by the 
RC time constant of the circuit, the voltage at the output will be approximately equal 
to the voltage of the input ramp signal. After the rise time, a new timestamp can be 
generated. The minimum required rise time is in part determined by the level of 
acceptable timing uncertainty. 
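The dependence of the required rise time on the RC time constant and on the acceptable uncertainty can be illustrated with a first-order model. The sketch below is an assumption made for illustration and is not the thesis design: it treats the conducting transmission gate and the holding capacitor as an ideal single-pole RC circuit, ignores the slope of the ramp during settling, and uses hypothetical component values and a hypothetical function name (`settling_time`).

```python
import math

def settling_time(r_on_ohms, c_hold_farads, step_volts, max_error_volts):
    """Time for a single-pole RC circuit to settle to within max_error_volts.

    Assumes the residual error decays as step_volts * exp(-t / (R * C)).
    """
    if max_error_volts >= step_volts:
        return 0.0
    tau = r_on_ohms * c_hold_farads
    return tau * math.log(step_volts / max_error_volts)

# Hypothetical values: 5 kOhm on-resistance, 20 fF holding capacitor,
# 0.5 V step between the held timestamp and the current ramp voltage.
t_coarse = settling_time(5e3, 20e-15, 0.5, 5e-3)    # settle to 1% of the step
t_fine = settling_time(5e3, 20e-15, 0.5, 0.5e-3)    # settle to 0.1% of the step
print(f"settle to 5 mV:   {t_coarse * 1e12:.0f} ps")
print(f"settle to 0.5 mV: {t_fine * 1e12:.0f} ps")
```

In this simplified model, tightening the allowed voltage error by a factor of ten adds a fixed RC ln(10) to the required rise time, which is one way to see why the acceptable timing uncertainty sets a lower bound on the interval between timestamps.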
In a practical circuit, the input ramp signal is limited to a voltage range less than 
that of the power supply voltages. During a timing interval, the ramp signal starts at a 
minimum voltage and then increases linearly with time until the end of the timing 
interval, at which point the ramp reaches maximum voltage. At the end of the timing 
interval, the timing ramp is reset to the minimum voltage and another timing interval 
can begin. The timing represented by a timestamp is relative to other times within 
the timing interval during which it was generated; this includes the start and stop 
times of the timing interval itself. 
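As a minimal sketch of how a held timestamp voltage might be decoded off-chip, the following maps a voltage on a linear ramp back to elapsed time and then to range. The ramp limits, interval length, and function names are hypothetical, and the example assumes that the timing interval starts when the laser pulse is emitted, so that the decoded time is a round-trip delay.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def timestamp_to_time(v_stamp, v_min, v_max, interval_s):
    """Map a held ramp voltage back to elapsed time within the timing interval."""
    if not v_min <= v_stamp <= v_max:
        raise ValueError("timestamp voltage lies outside the ramp range")
    return (v_stamp - v_min) / (v_max - v_min) * interval_s

def time_to_range(round_trip_s):
    """Convert a round-trip delay to a one-way range."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0

# Hypothetical ramp: 0.3 V to 0.8 V over a 1 microsecond timing interval.
elapsed = timestamp_to_time(0.55, 0.3, 0.8, 1e-6)
print(f"elapsed time: {elapsed * 1e9:.1f} ns, range: {time_to_range(elapsed):.2f} m")
```

Because the timestamp is only meaningful relative to the ramp limits, the decoder needs the same minimum voltage, maximum voltage, and interval length that were used to generate the ramp.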
2.2 Constant Fraction Discrimination 
The function of a constant fraction discriminator (CFD) in this thesis is to 
transform analog pulse peak events into digital events. The CFD accomplishes this by 
subtracting a delayed version of the input signal from itself; this approximates 
differentiation without also amplifying high-frequency noise. In contrast to a CFD, a 
simple threshold detector introduces range walk, also known as time-walk, because a 
change in amplitude for the input signal changes the timing of the output digital 
event. 
Development and analysis of CFDs have been conducted for decades, especially 
in the field of nuclear science for particle detection [13] [14] [15] [16] [17] [18] [19]. 
In Figure 3, the upper comparator approximates differentiation using a 
small delay, while the lower comparator (also called the arming comparator of the 
CFD) only allows the output to go high when the signal amplitude is above a 
threshold voltage; the arming comparator prevents the circuit from triggering on 
low-amplitude noise. 
 
Figure 3 -- Schematic of a simple Constant Fraction Discriminator 
 
Modern implementations of CFD circuits often use an all-pass filter as an analog 
delay [15] [16] [18] [19]. For CFDs that use an all-pass filter as the delay, it is 
important that the filter design have uniform group delay for all frequencies in the 
delayed signal; if the filter does not have uniform group delay, the signal shape will 
be distorted and may result in poor amplitude independence of the timing output for 
the CFD. For single-pole low-pass filters, amplitude response is nearly uniform for 
frequencies less than the corner frequency and group delay is uniform to a good 
approximation for frequencies much less than the corner frequency [19] [20]. 
 
 
Figure 4 – Transient Function of a CFD 
Left: Unit Amplitude Input; Right: 1/3 Unit Amplitude Input; Note that the timing of the output 
rising edge for the CFD does not vary with input pulse amplitude. 
 
Observe in Figure 4 that the timing of the output rising edge of an ideal CFD in 
the noiseless case for a Gaussian pulse input does not change with respect to the 
input pulse amplitude. In contrast, the output rising edge for a simple threshold 
detector is not invariant with respect to changes in input pulse amplitude. This concept 
is further defined in Subsection 2.4.2 on Range Walk. 
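The amplitude independence described above can be reproduced with a small idealized simulation. This is a sketch under stated assumptions, not the circuit designed in this thesis: the pulse is a noiseless Gaussian, the delay is ideal, and the pulse width, delay, threshold, and all function names are hypothetical choices made for illustration.

```python
import math

def gaussian(t, amplitude, center, sigma):
    return amplitude * math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))

def rising_crossing(times, values, level, armed=None):
    """Interpolated time at which values first rises through level.

    If armed is given, crossings are ignored while armed[i] is False.
    """
    for i in range(1, len(times)):
        if armed is not None and not armed[i]:
            continue
        if values[i - 1] < level <= values[i]:
            frac = (level - values[i - 1]) / (values[i] - values[i - 1])
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None

def detect(amplitude, center=20.0, fwhm=2.5, delay=0.5, threshold=0.2, dt=0.01):
    """Return (threshold_time, cfd_time) in ns for one noiseless Gaussian pulse."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    times = [i * dt for i in range(int(40.0 / dt) + 1)]
    signal = [gaussian(t, amplitude, center, sigma) for t in times]
    delayed = [gaussian(t - delay, amplitude, center, sigma) for t in times]
    # Simple threshold detector: fires when the pulse first exceeds the threshold.
    t_threshold = rising_crossing(times, signal, threshold)
    # CFD: fires when the delayed copy overtakes the input (just after the peak),
    # but only while the arming comparator sees the input above the threshold.
    difference = [d - s for d, s in zip(delayed, signal)]
    armed = [s > threshold for s in signal]
    t_cfd = rising_crossing(times, difference, 0.0, armed)
    return t_threshold, t_cfd

for amplitude in (1.0, 1.0 / 3.0):
    t_threshold, t_cfd = detect(amplitude)
    print(f"amplitude {amplitude:.3f}: threshold {t_threshold:.3f} ns, CFD {t_cfd:.3f} ns")
```

For a symmetric pulse, the delayed copy overtakes the input half a delay after the peak regardless of amplitude, whereas the threshold crossing moves later as the amplitude falls.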
2.3 Shift Registers 
Some background knowledge of shift registers is helpful for understanding the 
control logic of the proposed design. Resources outside this thesis cover the 
function and applications of shift registers [21]. 
Shift registers are generally used to convert data between sequential and parallel 
formats or to act as memory elements. Several types of shift registers exist including 
Serial-Input Serial-Output (SISO), Serial-Input Parallel-Output (SIPO), Parallel-Input 
Serial-Output (PISO), Parallel-Input Parallel-Output (PIPO), and circular shift registers. 
Shift registers can be used in applications where large numbers of input or output 
device components must share a small number of serial I/O ports.  
Circular shift registers are a special case of Linear Feedback Shift Registers (LFSRs) 
and are implemented by shifting the contents of the last memory cell of a shift 
register into the first cell at each clock cycle. Circular shift registers can be used to 
store repeated patterns or, in the case of the LFSR, to generate a repeating pattern. 
Due to this capability, circular shift registers and LFSRs 
can be used in the implementation of Finite State Machines (FSMs). 
 
 
Figure 5 – Idealized Serial-Input, Parallel-Output (SIPO) Shift Register consisting of three 
memory cells. 
 
A shift register is so called because the datum in every memory cell shifts to the 
next cell at each clock cycle. Observing the idealized SIPO shift register in Figure 5, for 
example, at the next clock cycle the input datum will replace the datum at memory 
cell B1, the datum at B2 will be replaced by the one presently in B1, and the datum in 
B2 will replace the one in memory cell B3. So for the state shown in Figure 5, if the 
input before the clock cycle was ‘1’ then the outputs after one clock cycle would be: 
B1=1, B2=0, B3=1. Figure 6 shows an example implementation of a SIPO shift register. 
 
 
Figure 6 – Schematic for a 3-bit SIPO shift register implemented using D flip-flops with reset 
functionality. 
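The shifting behavior of a SIPO register can be sketched in a few lines. The initial state below (B1=0, B2=1, B3=0) is an assumption chosen to be consistent with the worked example in the text, and the function name is hypothetical.

```python
def sipo_shift(cells, serial_in):
    """Return the cell contents (B1, B2, B3, ...) after one clock cycle."""
    # The serial input enters B1 and every other datum moves one cell along;
    # the datum in the last cell is discarded.
    return [serial_in] + cells[:-1]

# Assumed starting state: B1=0, B2=1, B3=0 (B3 does not affect the result).
state = [0, 1, 0]
state = sipo_shift(state, 1)
print(dict(zip(("B1", "B2", "B3"), state)))
```

With an input of 1, the register ends in the state B1=1, B2=0, B3=1, matching the outputs stated above.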
 
As previously indicated, the shift register in Figure 6 can be made into a circular 
shift register by using the output of B3 as the Input. After setting the initial values for 
each memory cell, the circuit would output a repeating pattern to B1, B2, and B3. 
Such a circular shift register could be used as a frequency divide-by-3 circuit with the 
input signal being the clock. 
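A behavioral sketch of the divide-by-3 use follows. The initial cell values are hypothetical; any pattern with a single high cell gives an output that is high for one clock cycle in every three.

```python
def circular_shift(cells):
    """One clock cycle of a circular shift register: the last cell feeds the first."""
    return [cells[-1]] + cells[:-1]

state = [1, 0, 0]          # hypothetical initial values for B1, B2, B3
b1_history = []
for _ in range(9):         # nine clock cycles
    b1_history.append(state[0])
    state = circular_shift(state)

print(b1_history)          # B1 sampled once per clock cycle
```

Each output repeats with a period of three clock cycles, so B1, B2, and B3 are three phases of the divided clock.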
2.4 Theory of Analysis 
2.4.1 Timing Interval Measurement Uniformity 
In this document, timing interval measurement uniformity refers to the linearity 
of the relationship between the change in range from a lidar sensor to a surface and 
the output range measurement corresponding to that surface. For the ideal lidar 
system, a linear change in the range from the sensor to a surface within a target 
scene results in a linear change to the corresponding output range measurement. For 
a realized system, amplifier non-linearity as well as effects of limited amplifier 
bandwidth result in a non-uniformity in measurements across the timing interval. 
Additionally, in multi-pixel lidar systems, process variation causes changes in 
measurement non-uniformity from pixel to pixel. When analyzing performance of a 
multi-pixel lidar system, it is appropriate to consider timing interval measurement 
uniformity for both a single pixel and across all pixels. 
For any timing interval measurement non-uniformity that does not vary with 
time or input signal shape, the output range measurement can be calibrated with 
respect to actual target surface range; assuming ideal calibration, this would 
eliminate all non-uniformity in measurements across the timing interval. One 
practical method of performing calibration involves a uniform scheme of calibration 
across all pixels in a lidar system. In such a case, process variation is not accounted 
for and will affect the variation on the range measurement outputs of the lidar 
system from pixel to pixel. 
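One common way to quantify non-uniformity of this kind is the residual of a least-squares linear fit between the true event time and the measured output. The sketch below uses made-up data, with a small quadratic bow standing in for amplifier non-linearity; the data, the function names, and the choice of an ordinary least-squares fit are assumptions for illustration rather than the analysis procedure of this thesis.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_residuals(xs, ys):
    """Deviation of each measurement from the best-fit line."""
    slope, intercept = linear_fit(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Hypothetical data: true pulse arrival times (ns) and measured timestamp
# voltages (V) with a small quadratic bow added to an ideal linear response.
true_times_ns = [float(t) for t in range(0, 101, 10)]
measured_v = [0.3 + 0.005 * t - 2e-6 * (t - 50.0) ** 2 for t in true_times_ns]
residuals_v = fit_residuals(true_times_ns, measured_v)
print(f"peak non-uniformity: {max(abs(r) for r in residuals_v) * 1e3:.2f} mV")
```

A residual that is stable over time and input shape is the part that calibration can remove; pixel-to-pixel differences in the residual are what a uniform calibration scheme leaves behind.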
 
2.4.2 Range Walk 
In a lidar system, range walk refers to the sensitivity of the output range 
measurement of a system to a change in return pulse intensity. For any surface at a 
given range, an ideal lidar system will output a range measurement which is 
independent of factors other than the range between the lidar system and the 
surface. For a lidar system with range walk, the output range measurement is not 
only dependent on range from lidar system to surface, but also on the intensity of 
the return pulse. Return pulse amplitude is dependent on, among other things, the 
characteristics of a surface as well as the viewing angle and range to the surface. In 
real-world operation of a lidar system, return pulse amplitude can vary significantly 
for objects at the same range. As a result, range walk is an important property of a 
lidar system. 
 
2.4.3 Multi-Pulse Timing Separation Resolution and Separation Confusion 
For a lidar system capable of detecting multiple surfaces within a single return 
waveform, it is important to consider the resolving capability of the detector. In other 
words, for such a lidar system, it is important to determine the minimum separation 
in range between two surfaces such that any less separation would result in a 
non-detection for one of the surfaces. This minimum separation is also referred to as 
dead time. For a multi-pixel system, separation resolution is considered within a 
single pixel and may vary from pixel to pixel due to process variation. 
For a peak-amplitude event detector, such as a constant fraction discriminator, it 
is possible to calculate the minimum theoretically resolvable separation between two 
Gaussian pulse events. Consider the sum of two Gaussian pulses, one centered at 
time t = 0, the other at t = t0: 
 
$$f(t) = e^{-\frac{(t-t_0)^2}{2\sigma^2}} + e^{-\frac{t^2}{2\sigma^2}}$$
Eqn. 1
where σ² is the variance, which is related to the Full Width at Half Maximum
(FWHM) of the pulse, and 𝑒 is Euler’s number. For simplicity, it is assumed that the 
two pulses are both of unit amplitude and have the same FWHM. 
In the resolvable case where t0 is large, Eqn. 1 has two local maxima; the maxima
are located approximately at t = 0 and t = t0. In the unresolvable case, Eqn. 1 has a
single maximum at t = t0/2. As a result of this, the derivative of Eqn. 1 will have a
slope less than or equal to zero at t = t0/2 if and only if Eqn. 1 has one peak. The
slope of the derivative of Eqn. 1 (that is, its second derivative) is as follows:
 
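$$\frac{d^2}{dt^2}\left[f(t)\right] = -\frac{e^{-\frac{(t-t_0)^2}{2\sigma^2}}\left(-t_0^2 + 2 t_0 t + \sigma^2 - t^2\right)}{\sigma^4} - \frac{e^{-\frac{t^2}{2\sigma^2}}\left(\sigma^2 - t^2\right)}{\sigma^4}.$$
Eqn. 2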
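Evaluating Eqn. 2 at t = t0/2 yields:

$$\left.\frac{d^2}{dt^2}\left[f(t)\right]\right|_{t=\frac{t_0}{2}} = 2\,\frac{e^{-\frac{t_0^2}{8\sigma^2}}\left(\frac{t_0^2}{4} - \sigma^2\right)}{\sigma^4}.$$
Eqn. 3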
Setting Eqn. 3 equal to zero and solving for 𝑡0 yields the minimum resolvable 
separation, 𝑡𝑚𝑖𝑛, 
 
$$2\,\frac{e^{-\frac{t_{min}^2}{8\sigma^2}}\left(\frac{t_{min}^2}{4} - \sigma^2\right)}{\sigma^4} = 0.$$
Eqn. 4
Considering that the separation is always positive, this can be simplified to: 
$$t_{min} = 2\sigma.$$
Eqn. 5
Intuitively, this result makes sense considering that the inflection points for a 
Gaussian pulse are at 𝑡 = µ ±  𝜎, where µ is the center of the pulse. [22] 
The relationship between FWHM and σ for a Gaussian pulse is as follows: [23] 
$$\mathrm{FWHM} = 2\sqrt{2\ln(2)}\,\sigma \approx 2.355\,\sigma$$
Eqn. 6
where 𝑙𝑛(∙) denotes the natural logarithm function. 
Using the equivalence expressed in Eqn. 6, Eqn. 5 can be rewritten as follows: 
 
$$t_{min} = 2\sigma = \frac{\mathrm{FWHM}}{\sqrt{2\ln(2)}} \approx 0.849 \cdot \mathrm{FWHM}$$
Eqn. 7
In practice, the minimum resolvable separation will be larger than this due to 
noise and imperfections in the realized peak detector. 
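The bound in Eqn. 5 and Eqn. 7 can be checked numerically. The following is a minimal sketch, assuming noiseless unit-amplitude Gaussian pulses sampled on a fine grid; the function names and the grid size are illustrative choices, not part of the detector design.

```python
import math


def two_pulse_sum(t, t0, sigma):
    """Eqn. 1: two unit-amplitude Gaussian pulses centred at 0 and t0."""
    return (math.exp(-((t - t0) ** 2) / (2.0 * sigma ** 2))
            + math.exp(-(t ** 2) / (2.0 * sigma ** 2)))


def count_peaks(t0, sigma, steps=20000):
    """Count local maxima of Eqn. 1 on a grid spanning both pulses."""
    lo, hi = -5.0 * sigma, t0 + 5.0 * sigma
    dt = (hi - lo) / steps
    samples = [two_pulse_sum(lo + i * dt, t0, sigma) for i in range(steps + 1)]
    # A sample is a peak if it rises from the left and does not rise to the right.
    return sum(1 for i in range(1, steps)
               if samples[i - 1] < samples[i] >= samples[i + 1])


def t_min_from_fwhm(fwhm):
    """Eqn. 7: minimum resolvable separation for pulses of a given FWHM."""
    return fwhm / math.sqrt(2.0 * math.log(2.0))


if __name__ == "__main__":
    sigma = 1.0
    print(count_peaks(1.8 * sigma, sigma), count_peaks(2.2 * sigma, sigma))
    print(t_min_from_fwhm(2.5))  # separation in the same time units as the FWHM
```

Separations slightly below 2σ merge into a single peak while separations slightly above 2σ remain two peaks, which is the behavior Eqn. 5 predicts for the ideal case.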
To avoid any confusion, note that this bound is unrelated to the Rayleigh 
criterion of imaging which is based on the optical property of diffraction [24]. While 
there is similarity between the Rayleigh criterion and Eqn. 7 in some respects, the 
two equations deal with different functions. 
 
2.4.4 Process Variation 
Process variation describes variations that arise due to imperfect die 
manufacture. These variations can affect all dimensions of a transistor including 
length and width. There are two common ways of modeling process variation: Monte 
Carlo Analysis and Corner Analysis. In Corner Analysis, the circuit is modeled at the 
extremes (the ‘corners’) of variation which can occur due to the manufacturing 
process based on the guarantee of the manufacturer. The goal of this form of analysis 
is typically that if the circuit is designed such that functionality is maintained at each 
process corner, there will be negligible likelihood that a faulty chip would be 
produced based on the guarantee of the chip manufacturer. 
For a multi-pixel range detecting lidar, pixel-to-pixel (P2P) process variation 
results in varying range measurements across pixels; as a result, it is necessary to 
consider this type of process variation in the present thesis. It is assumed in 
simulations conducted for this thesis that process variation among transistors within 
a single pixel is negligible. This is a reasonable assumption because one of the goals 
for the thesis is that the design be a low-footprint solution; as a result, all transistors 
within a single pixel are assumed to be sufficiently near each other on the die that 
significant process variation is avoided. Further details of this assumption are 
explained in Chapter 3, on design.  
Wafer-to-wafer (W2W) and die-to-die (D2D) process variation uniformly affect all 
pixels in a multi-pixel lidar system. This type of process variation is of less concern 
than P2P variation because range measurements will be uniformly affected across 
pixels. As discussed in Section 2.4.1 on Timing Interval Measurement Uniformity, this 
type of systematic range error can be mitigated through simple calibration. 
Concerning variation that uniformly affects all transistors, any design conceived must 
retain functionality under industry standard process variations. The process variation 
considered in this thesis is based on the IBM 90nm process corner standard. 
Considerations on the effects of process variation are included in the subsections 
on other analyses.
3 LAST TWO SURFACE RANGE DETECTOR DESIGN 
3.1 Introduction 
The goal for the proposed design is to detect and output a timestamp voltage for 
each of the last two pulses in an analog input signal. This analog input signal is a 
pulse train which may contain any number of Gaussian-esque pulses of varying 
amplitudes within the timing interval under consideration. The pulses within the 
signal may have varying separation, but a minimum separation is assumed between 
any two adjacent pulses. The input signal is also assumed to have a voltage swing 
compatible with the input buffer for the circuit. The allowed voltage range for the 
proposed design is 400mV – 800mV.  
A timestamp voltage is defined in this thesis as being a voltage which represents 
the time at which a pulse occurred in the analog input signal within the timing 
interval under consideration; specifically, the time at which the pulse reaches its 
peak voltage is timed. To further clarify terms, an analog pulse peak event refers to 
the time-voltage point at which a pulse within an analog signal is at its maximum 
amplitude. 
In developing a detector design, it is important to consider the input signal 
characteristics. The proposed detector is for use in a flash lidar system. Figure 7 
shows an example target scene for the lidar system. Figure 8 shows the return signal 
that results from one of the pixels in the target scene. In a lidar system implementing 
the proposed detector design, it is assumed that the representative return signal 
shown in Figure 8 would be output from the lidar photodiode and input to the 
detector as a voltage signal scaled to the voltage range of 400mV – 800mV. 
 
Figure 7 – Simulated range image of a civilian truck beneath a tree. 
Grid lines indicate pixels. The ‘X’ marker near the center of the image indicates the pixel used to 
generate the signal shown in Figure 8. The marker and rectangle are only shown to aid the eye 
and do not affect the simulation. The rectangle highlights the position of the truck. Also note that 
the different colors/shades in the image represent different ranges; the truck is more distant than 
the tree leaves from the perspective of the observer. 
 
Figure 8 – Simulated return signal generated using the range image in Figure 7. 
Note that all pulses are Gaussian-esque with pulse peaks near or at the line of symmetry for all 
pulses. Full width at half maximum for all pulses is 2.5 ns. The amplitude shown is in arbitrary 
units. 
 
The signal in Figure 8 contains all of the characteristic features of return signals 
which were crucial to consider in the development of the design proposed in this 
thesis. Earlier than 125 ns within the return signal, there are several return pulses 
associated with the tree in Figure 7. The precise timing of the pulses associated with 
the tree is of little interest when considering the objective of this thesis which, for 
the case shown in Figure 7 and Figure 8, is the detection of the vehicle beneath the 
tree. Between 150 ns and 200 ns there is a pulse associated with the vehicle in Figure 
7 followed by a pulse associated with the ground; the precise timing of both of these 
pulses for all relevant pixels can aid in the detection and identification of the vehicle 
beneath the tree. 
The full width at half maximum (FWHM) for Gaussian pulses in the example 
signal was chosen to be 2.5 ns to ensure sufficient range resolution for objects within 
the sample target scene and because it was expected that a circuit could be designed 
with sufficient bandwidth for a 2.5 ns FWHM Gaussian pulse. The frequency 
spectrum of Figure 8 is shown in Figure 9. 
 
 
Figure 9 – Frequency content of the signal shown in Figure 8 
Note that the signal bandwidth is much less than 0.5 GHz. 
 
 
As seen in Figure 9, the 3dB Bandwidth of a typical return pulse is less than 
250MHz and nearly all signal energy exists at less than 400MHz. The detector design 
must accommodate signals of this bandwidth. 
To accomplish the stated goal, three main processing stages are used. The first 
stage takes the analog signal from the photodiode as input and converts analog pulse 
peak events into digital rising-edge events. The second stage is control logic which 
acts to control the behavior of the third stage. The third stage converts digital timing 
events into analog voltage timestamps. With appropriate control logic, this achieves 
the design goal. 
The proposed solution is shown in Figure 10. The design was implemented in 
Cadence using IBM 90nm technology. For the first stage, a Constant Fraction 
Discriminator (CFD) is used; this is covered in detail in Section 3.2. The control logic 
of the second stage is shown as a block in Figure 10; details of the control logic are 
covered in Section 3.3, Timing Logic. The third stage analog timing mechanism is 
discussed in detail in Section 3.4, Track and Hold Analog Timers. 
 
Figure 10 – High Level Last Two Surface Range Detector Schematic 
 
To aid in understanding the proposed design, Figure 11 shows an example set of 
inputs and outputs for one timing interval. As shown, there are five pulses in the 
example input signal for the timing interval considered. The peak events for each of 
the analog pulses are first converted into digital rising edge events by the CFD (CFD 
output is labeled as Vc in the plot). The digital events trigger timing measurements in 
the analog timing mechanism. At the end of the timing interval, the analog output 
voltages are read as timestamps. Two of the timestamps represent measurements 
for pulses while the remaining timestamp voltage is equal to the arbitrarily chosen 
non-measurement voltage. 
 
Figure 11 – Example Waveforms for the Last Two Surface Range Detector 
The detector input is the topmost waveform. The CFD output is labeled Vc. Vout1, Vout2, and 
Vout3 are the three outputs of the detector; the transient waveforms for each is plotted. Timing 
Output 1 and Timing Output 2 are the output timestamp voltages. On the same plot of each 
detector output waveform, the output waveform for no detections is plotted in dashed light-grey 
to aid the eye. Vertical dashed lines appear across all voltages at each measurement time to aid 
the eye. 
 
The design of the control logic is determined by the goal of timing more than 
one pulse. There is more than one timestamp output; as a result, the control logic 
must direct the digital event signal to the appropriate timing component. This is 
conceptually similar to multiplexing. 
As seen in Figure 10, there are three timestamp voltage outputs (labeled Vt1,
Vt2, and Vt3), each of which results from one of three duplicate branches of the analog
timing mechanism. Each of the three branches can be in one of two modes: Track or Hold.
Output voltages for branches in holding mode are timestamp voltages for pulse peak 
events. An output in tracking mode simply outputs the ramp signal voltage with 
some small gain applied. In this design one of the three branches must always be in 
tracking mode for reasons which will be elaborated on in Section 3.4. As a result, 
after each timing interval the voltages read out from the detector are two timestamp 
voltages in addition to a voltage which is equal to the input ramp signal voltage at 
the time of readout. It should be noted that the voltage of the input ramp signal at 
the time of readout does not affect the timestamps and can be chosen arbitrarily. 
With each new timing event, the oldest of the existing timestamps is overwritten by 
changing the mode of the corresponding Track and Hold branch to tracking mode; at 
the same time, a new timestamp is created. 
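The branch rotation described above can be sketched as a purely behavioral model. This is not the control logic of Section 3.3; it only mimics the stated rule that one branch always tracks and that each new event overwrites the oldest held timestamp. The function name, the branch indexing, and the default identity ramp (timestamp equals event time) are assumptions made for illustration.

```python
def last_two_timestamps(event_times, ramp=lambda t: t):
    """Return the held ramp values for the last two events, oldest first."""
    held = {}        # branch index -> ramp value captured when the branch went to hold
    hold_order = []  # branches currently in hold mode, oldest first
    tracking = 0     # exactly one branch follows the ramp at any time

    for t in sorted(event_times):
        # The tracking branch samples the ramp and switches to hold mode.
        held[tracking] = ramp(t)
        hold_order.append(tracking)
        if len(hold_order) == 3:
            # Release the oldest timestamp; that branch becomes the tracking branch.
            oldest = hold_order.pop(0)
            del held[oldest]
            tracking = oldest
        else:
            # Fewer than two timestamps existed: use a branch not yet holding.
            tracking = next(b for b in range(3) if b not in hold_order)

    return [held[b] for b in hold_order]


if __name__ == "__main__":
    # Five hypothetical pulse-peak times (ns) within one timing interval.
    print(last_two_timestamps([30.0, 60.0, 95.0, 160.0, 175.0]))
```

In the model, the hold and release happen in sequence inside one loop iteration; in the circuit they are described as occurring at the same time.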
In conclusion, the proposed design shown in Figure 10 achieves the goal of 
outputting a timestamp for the last two pulses in an input analog pulse train. All 
schematics for the proposed design were implemented and simulated in Cadence 
using IBM 90nm technology. 
3.2 Constant Fraction Discriminator 
The Constant Fraction Discriminator (CFD) subsystem takes an input signal 
containing a series of analog Gaussian-esque pulse peak events and transforms them 
into a series of digital events; voltage timestamps are generated for these digital 
events in subsequent subsystems. The digital event outputs from the CFD are the 
rising edges of a rectangular pulse. The timing of these rising edges is, to a good 
approximation, related to the timing of the input analog events by only a constant 
delay; it is not affected by the amplitude of the analog events. 
 
Figure 12 – Final High-Level Constant Fraction Discriminator Schematic 
 
Shown in Figure 12 is the high-level schematic of the CFD for the proposed 
circuit. For example waveforms demonstrating the ideal functionality of a CFD, see 
Figure 4 in Section 2.2. As seen in the figure, there are two comparators. Both 
comparators take analog signals as inputs and output digital signals. Additionally, 
both comparators are asynchronous. The upper comparator in the figure is the 
differentiating comparator; it approximates differentiation by subtracting Vin, the 
input signal, from a delayed version of itself. A small analog delay is accomplished
using a low-pass, single-pole filter with a corner frequency larger than the bandwidth
of the signal; because it passes the full signal band, it is referred to in this thesis as
an all-pass filter. The lower comparator, the
arming comparator, performs a simple threshold by comparing Vin against a 
threshold voltage, Vth. The arming comparator is so called because it prevents the 
circuit from triggering on low-amplitude noise; it only allows output from, or ‘arms’, 
the circuit when the threshold is met. The analog buffer shown in Figure 12 drives 
the analog delay and comparator inputs. It is necessary because the photodiode 
circuit (see Figure 10) which supplies the voltage input cannot supply much current 
without the input signal being significantly altered. The NAND gate and D flip-flop 
with zero-reset (DFFwRST0) combine the output from the differentiating and arming 
comparators such that the final CFD output contains a digital rising edge event when 
and only when there is a pulse peak event at the input that satisfies the threshold; 
there is some delay between input and output. 
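A minimal behavioral sketch of the two-comparator arrangement follows. It assumes an ideal pure time delay in place of the single-pole filter, ideal comparators, and a noiseless Gaussian pulse measured relative to its baseline; the NAND gate and flip-flop are reduced to "report the first rising edge of the differentiating comparator while the arming comparator is high". All names and numeric values are illustrative.

```python
import math


def gaussian(t, amp, mu, sigma):
    """Gaussian pulse of peak amplitude amp centred at mu."""
    return amp * math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))


def cfd_event_time(amp, mu, sigma, delay, vth, dt=1e-3):
    """Time of the first armed rising edge of the differentiating comparator, or None."""
    start = mu - 5.0 * sigma
    steps = int(round(10.0 * sigma / dt))
    prev_high = False
    for i in range(steps + 1):
        t = start + i * dt
        vin = gaussian(t, amp, mu, sigma)
        vin_delayed = gaussian(t - delay, amp, mu, sigma)
        diff_high = vin_delayed > vin  # differentiating comparator: input has begun to fall
        armed = vin > vth              # arming comparator: input exceeds the threshold
        if diff_high and not prev_high and armed:
            return t
        prev_high = diff_high
    return None


if __name__ == "__main__":
    sigma = 2.5 / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # 2.5 ns FWHM
    for amp in (0.1, 0.4):
        print(amp, cfd_event_time(amp, mu=50.0, sigma=sigma, delay=0.29, vth=0.05))
```

Under the pure-delay assumption the delayed and undelayed pulses cross at half the delay after the peak regardless of amplitude, so the event time is offset from the peak by a constant and a pulse below the threshold produces no event.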
The buffer, analog delay, and comparators all contain analog components. The 
capacitive and resistive loads as well as the current supply requirements for each 
analog component are dependent on all other analog components connected to it. 
As a result, it was necessary to perform the final stages of the design of all four of 
these components in parallel in an iterative process. 
 
3.2.1 Differential Amplifier 
 
Figure 13 – Single-Ended Output, Differential Input CMOS Amplifier; basic schematic 
Each transistor T0 through T4 is defined by a length and a width parameter. T0 is the bias current 
transistor with Vbias being the bias voltage; T3 and T4 are the current-mirror load transistors; 
T1 and T2 are the amplifying transistors. 
 
A Single-Ended Output, Differential Input CMOS Amplifier [25] is the principal
component for the comparators and buffers used in this thesis; it is referred to in this 
thesis as a differential amplifier for brevity. The output for a differential amplifier is 
the subtraction of the two inputs, Vinp and Vinn, times the gain (Av). 
$$V_{out} = A_v \cdot (V_{inp} - V_{inn})$$
Eqn. 8
An asynchronous comparator can be constructed from a differential amplifier by 
using a large Av and allowing the output signal, Vout, to be clipped by the positive 
and negative voltage supply rails, Vdd and Vss. As long as Av is sufficiently large 
compared to the ratio between the supply rail voltage swing and the input voltage 
difference, Vout can be treated as a digital signal. This requirement is derived from 
Eqn. 8 and shown in Eqn. 10. Larger Av may be used if faster rise/fall time on the 
output digital signals is necessary. 
 
$$A_v = \frac{V_{out}}{V_{inp} - V_{inn}}$$
Eqn. 9

$$A_v \gg \left(\frac{V_{dd} - V_{ss}}{V_{inp} - V_{inn}}\right)$$
Eqn. 10
Large Av can be accomplished by inputting Vout to additional gain stages. 
An analog buffer can be constructed from a differential amplifier by connecting 
Vinn to Vout. Vinp becomes the buffer input, Vin. Substituting into Eqn. 8 yields: 
$$V_{out} = A_v \cdot (V_{in} - V_{out})$$
Eqn. 11
Solving for Vout yields: 
 
$$V_{out} = \left(\frac{A_v}{1 + A_v}\right) \cdot V_{in}$$
Eqn. 12
where the buffer gain is Av / (1 + Av).
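To give a feel for the magnitudes in Eqn. 10 and Eqn. 12, the sketch below evaluates both for example values. The 1.2 V supply swing is the one used in this thesis; the 10 mV input difference and the open-loop gains passed to the buffer formula are illustrative inputs, and the helper names are hypothetical.

```python
import math


def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def linear_to_db(gain):
    """Convert a linear voltage gain to dB."""
    return 20.0 * math.log10(gain)


def comparator_gain_floor_db(vdd, vss, v_diff):
    """Right-hand side of Eqn. 10 in dB; Av should be much larger than this."""
    return linear_to_db((vdd - vss) / v_diff)


def buffer_gain(av_linear):
    """Eqn. 12: closed-loop gain of the differential amplifier wired as a buffer."""
    return av_linear / (1.0 + av_linear)


if __name__ == "__main__":
    print(comparator_gain_floor_db(1.2, 0.0, 0.010))  # floor for a 10 mV input difference
    for av_db in (10.0, 23.1, 40.0):                  # example open-loop gains in dB
        print(av_db, buffer_gain(db_to_linear(av_db)))
```

The buffer gain approaches unity only as Av grows, which is why a fixed, known gain error is later absorbed into the range conversion rather than designed out.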
Observing the equations above, in addition to consideration of where the 
differential amplifier will be used, the gain for the differential amplifier is not crucial 
to the design functionality of the Last Two Surface Range Detector. In total, there are 
five buffers and two comparators in the Last Two Surface Range Detector; each uses 
a differential amplifier. For the timing mechanism (see Figure 10), the input timing 
ramp encounters a buffer, a transmission gate, another buffer, and then is output as 
a timestamp. The timestamps cannot be directly interpreted as range, but must 
instead be converted to range using a simple linear formula of the form: 
$$R = M \cdot V_t + B$$
Eqn. 13
Where R is the range from detector to target object; Vt is the timestamp voltage; 
B is a constant offset; and M is a constant multiplier which effectively converts the 
voltage to a travel time for the light pulse and then to a distance traveled assuming 
the speed of light is known and constant. Based on this, if the gain of the buffers is 
not perfectly unitary then M can be altered to compensate if necessary. Alternatively, 
the input timing signal can be altered to compensate for small amplifications and 
non-linear effects due to the buffers. The key requirement for these buffers is that 
they present a large input impedance to the timing ramp and that they have low 
output impedance in addition to being approximately linear for the bandwidth 
considered. For the buffer in the CFD, gain is again not a key requirement as long as it 
provides high input impedance and sufficiently low output impedance such that it 
can drive the subsequent analog components of the CFD. For the differential 
amplifiers in the comparators, the only requirement is that they provide an accurate 
subtraction. The gain of these amplifiers does not need to be large because 
subsequent gain stages can be added to achieve the functionality of a comparator. 
Overall the circuit will likely perform adequately if the differential amplifier gain is 
only on the order of a few decibels. 
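As an illustration of Eqn. 13, the sketch below derives M and B for an idealized timing ramp. It assumes a linear ramp that starts at the moment the laser pulse leaves the system, a purely multiplicative buffer gain, and round-trip time of flight; the ramp start voltage, ramp slope, and gain used in the example are hypothetical values, not design parameters from this thesis.

```python
C_M_PER_S = 299_792_458.0  # speed of light in metres per second


def range_coefficients(ramp_start_v, ramp_slope_v_per_s, buffer_gain=1.0):
    """Return (M, B) of Eqn. 13 for Vt = buffer_gain * (ramp_start + slope * t)."""
    m = C_M_PER_S / (2.0 * buffer_gain * ramp_slope_v_per_s)
    b = -m * buffer_gain * ramp_start_v
    return m, b


def timestamp_to_range(vt, m, b):
    """Eqn. 13: convert a timestamp voltage to range in metres."""
    return m * vt + b


if __name__ == "__main__":
    # Hypothetical ramp: 0.4 V to 0.8 V over a 200 ns timing interval.
    slope = (0.8 - 0.4) / 200e-9
    m, b = range_coefficients(0.4, slope)
    print(timestamp_to_range(0.6, m, b))  # timestamp halfway up the ramp
    # A non-unity buffer gain is absorbed by recomputing M and B.
    m2, b2 = range_coefficients(0.4, slope, buffer_gain=0.9)
    print(timestamp_to_range(0.9 * 0.6, m2, b2))
```

Both calls describe the same event time, so they return the same range; only the constants change when the buffer gain changes.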
A further requirement of the differential amplifiers is bandwidth. As discussed 
previously in this chapter, the bandwidth for all components should be more than 
about 0.5 GHz so as to accommodate a typical return signal containing 
Gaussian-esque pulses with FWHM of 2.5ns. Design was conducted with the goal of 
achieving a -3dB bandwidth between 500MHz and 1GHz. 
Table 1 describes the initial differential amplifier design; transistor identifiers are 
with reference to Figure 13. Measurements for the initial design were taken with no 
capacitive load on the output and with ideal voltage inputs. All measurements were 
conducted on schematics in simulation using Cadence Virtuoso. Since 90nm CMOS 
technology is used in this thesis, Vdd = 1.2V and Vss = 0V. 
 
Table 1 – Initial Differential Amplifier Design and Performance
Identifiers are referenced to Figure 13
Parameter                               Value
T0 (L; W)                               400nm; 10.08µm
Bias Voltage (Vbias)                    485mV
T1 & T2 (L; W)                          400nm; 50.04µm
T3 & T4 (L; W)                          400nm; 6µm
Low Freq. Gain (AV0)                    23.1dB
3dB Bandwidth (f3dB)                    659MHz
Phase Margin (PM)                       38.3 degrees
Low Freq. Group Delay (TG0)             297ps
95% Group Delay Frequency (f95%Tg)      255MHz
Group Delay at f-3dB (TG3dB)            216ps
 
Group delay for the differential amplifier was determined so as to evaluate the 
possibility of using it as an analog delay. Group delay is equal to signal delay to a 
good approximation under certain conditions [20]. 
The bias voltage for the initial design was chosen such that Vout=600mV when 
Vinp=Vinn=600mV; that is to say the design is such that the input offset voltage, 
Vinoffset, is 600mV.  
 
Figure 14 – Initial Differential Amplifier DC sweep 
Two plots are shown: Vinoffset=Vinp=Vinn=400mV and Vinoffset=Vinp=Vinn=600mV 
 
Figure 14 is a DC voltage sweep created by holding the two inputs, Vinp and Vinn, 
equal while changing the bias voltage and measuring the output voltage. This creates 
a plot of potential DC operating points, also called Q-points, for the circuit. 
 As can be inferred from Figure 14 the differential amplifier is well designed for 
operation with Vinoffset=Voutoffset=600mV. This indicates that the differential amplifier 
would be well-applied as a buffer for signals with a 400mV-800mV swing because 
such a voltage swing is centered on 600mV. However as indicated by Figure 14, the 
initial differential amplifier design will not function well when Vinoffset=400mV and 
Voutoffset=600mV as would be desired for a differential amplifier operating with an 
input voltage swing of 400mV-800mV and the input signals considered in this thesis. 
Some tweaking is needed to ensure that all transistors maintain saturation while the 
amplifier is operating with these parameters. 
 Observing the Figure 14 plot for Vinoffset=400mV, the output voltage does not 
drop to near-threshold-voltage levels in the same way that the Vinoffset=600mV plot 
does. One could conclude from this that there is not enough current flowing from 
the Vout node to Vss to keep all transistors in the differential amplifier in saturation. 
Furthermore, this implies that one potential solution would be to change the width 
of the bias transistor from 10.08µm to a larger width. It is unlikely that increasing the 
amplifying transistor widths would solve the problem since they are already much 
larger than the bias transistor width at 50.04µm each. In other words, the bias 
transistor seems to be the bottleneck for current flow in this scenario. 
 A voltage range of 400mV-800mV was chosen for the design. It was anticipated 
that using such a voltage swing would enable a design that would allow all amplifier 
transistors to remain in the saturation region of operation since the threshold voltage 
for all transistors in the designs used is near 300mV for NMOS and near -300mV for 
PMOS transistors. It is important that all transistors remain in the saturation region 
to guarantee amplifier linearity. 
 After optimization of the differential amplifier while integrated with other analog
components, the final design parameters are as shown in Table 2 and Table 3; note that
the voltage and transistor identifiers in the tables are with reference to Figure 13. In the
final design there was only one difference between the differential amplifier 
schematic for the buffer as compared to the one for the comparator; the bias 
transistor for the differential amplifier in the comparator was made larger to 
accommodate an input offset voltage of 400mV. 
  
 
Table 2 – Buffer Differential Amplifier Design Parameters
Identifiers are referenced to Figure 13
Parameter                               Value
T0 (L; W)                               400nm; 6µm
Bias Voltage (Vbias)                    468mV
T1 & T2 (L; W)                          400nm; 22µm
T3 & T4 (L; W)                          400nm; 3µm
Total Area                              12.4µm²
Low Freq. Gain (AV0)                    -2.29dB
3dB Bandwidth (f3dB)                    1.47GHz
Low Freq. Group Delay (TG0)             95ps
105% Group Delay Frequency (f95%Tg)     317MHz
Input Capacitance                       24.0fF
 
 
 
  
 
Table 3 – Comparator Differential Amplifier Design Parameters
Identifiers are referenced to Figure 13
Parameter                               Value
T0 (L; W)                               400nm; 12µm
Bias Voltage (Vbias)                    468mV
T1 & T2 (L; W)                          400nm; 22µm
T3 & T4 (L; W)                          400nm; 3µm
Total Area                              14.8µm²
Low Freq. Gain (AV0)                    20.9dB
3dB Bandwidth (f3dB)                    1.18GHz
Low Freq. Group Delay (TG0)             164ps
95% Group Delay Frequency (f95%Tg)      623MHz
Group Delay at f-3dB (TG3dB)            126ps
  
  
The gain for the buffer differential amplifier described in Table 2 is less than 0dB 
meaning that it attenuates the input signal. This results in an undefined phase 
margin because a closed loop system with a single feedback path of unity gain is 
stable when the open loop gain is attenuating [26]. This satisfies concerns about 
buffer stability. The Bode plot for the buffer is shown in Figure 16 in Subsection 3.2.2. 
The input capacitance listed in Table 2 was determined using Cadence 
simulations. 
The equation for current through a capacitor is

$$I_c = C\,\frac{dV_c}{dt}$$
Eqn. 14
where C is capacitance and Vc is the voltage across the capacitor.
Substituting an average for instantaneous current, this becomes:

$$I_{c\_avg} = C\,\frac{\Delta V_c}{\Delta t}$$
Eqn. 15
Finally this is solved for capacitance:

$$C = I_{c\_avg}\,\frac{\Delta t}{\Delta V_c}$$
Eqn. 16
The input capacitance for the buffer was determined by supplying an ideal 
voltage ramp to the input while measuring the current flow into the input node. The 
capacitance from the input node to ground is then approximated using Eqn. 16 by
plugging in the change in time over change in voltage for the ramp used in the
simulation, Δt/ΔVc, and the average measured current into the buffer input, Ic_avg.
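The calculation in Eqn. 16 amounts to a single multiplication, sketched below. The ramp and current values in the example are hypothetical and were chosen only so that the result is of the same order as the input capacitance listed in Table 2; they are not the values used in the simulation.

```python
def input_capacitance(i_avg_a, delta_t_s, delta_v):
    """Eqn. 16: C = Ic_avg * (delta_t / delta_Vc), in farads."""
    return i_avg_a * delta_t_s / delta_v


if __name__ == "__main__":
    # Hypothetical measurement: 0.4 V ramp applied over 1 ns, 9.6 uA average input current.
    c_in = input_capacitance(9.6e-6, 1e-9, 0.4)
    print(c_in * 1e15, "fF")
```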
 The lengths and widths in Table 2 and Table 3 allow the calculation of a lower 
bound for the footprint of each of the amplifiers described. The longest dimension is 
the width of the amplifying transistors at 22µm. It is likely, but not necessary, that 
the unit cell will be square since it is to be used for imaging. This implies that the unit 
cell will likely be at least 22µm by 22µm, or 484µm² total area, to accommodate the 
largest transistor width. This estimate is a lower bound since there will be some 
unaccounted space needed for connections and wiring which may increase the 
largest dimension. It is not only important to consider the longest dimension, but 
also the requisite total amount of area for all the components in the unit cell. Again
ignoring area contributed by connections and wiring, the total areas shown in Table 2
and Table 3 are calculated by simply adding the product of the dimensions for each 
transistor. One can conclude that several of each of these amplifiers could easily fit 
within a 22µm by 22µm footprint. 
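The area arithmetic can be reproduced with a short script. The dimensions are taken from Table 2 and Table 3; the row layout and the `count_pairs` switch are assumptions made for illustration. The tabulated totals are obtained when each listed (L; W) row is counted once, while counting both transistors of each matched pair gives a somewhat larger figure; in either case the total is far smaller than a 22µm by 22µm cell.

```python
def total_area_um2(rows, count_pairs=False):
    """Sum L*W over table rows given as (length_nm, width_nm, transistor_count)."""
    total_nm2 = 0
    for length_nm, width_nm, count in rows:
        total_nm2 += length_nm * width_nm * (count if count_pairs else 1)
    return total_nm2 / 1e6  # nm^2 to um^2


# Rows: T0, T1 & T2, T3 & T4
BUFFER_ROWS = [(400, 6000, 1), (400, 22000, 2), (400, 3000, 2)]       # Table 2
COMPARATOR_ROWS = [(400, 12000, 1), (400, 22000, 2), (400, 3000, 2)]  # Table 3

if __name__ == "__main__":
    cell_um2 = 22.0 * 22.0
    for name, rows in (("buffer", BUFFER_ROWS), ("comparator", COMPARATOR_ROWS)):
        once = total_area_um2(rows)
        paired = total_area_um2(rows, count_pairs=True)
        print(name, once, paired, paired / cell_um2)
```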
3.2.2 Buffer 
 
Figure 15 – Analog Buffer; basic schematic. 
Each transistor T0 through T4 is defined by a length and a width parameter. 
As described in Subsection 3.2.1, the analog buffers are constructed
from a differential amplifier by connecting Vinn to Vout. The transistor lengths and
widths are described in Table 2 in Subsection 3.2.1. The Bode plot of the open loop
amplifier is shown in Figure 16. The Bode plot confirms very good stability for the
buffer even if gain is added [26]. In this thesis, signal frequency content considered is
less than 10^9 Hz.
Group delay for the buffer was measured to determine viability as an analog 
delay. This will be discussed further in Subsection 3.2.4. 
 
Figure 16 – Buffer Amplifier Bode Plot 
Gain plotted with solid line; Phase plotted with Dashed Line. Low frequency gain and cutoff 
frequency are marked on the gain plot. 
 
Figure 17 – Buffer Bode Plot and Group Delay 
Top: Buffer Bode Plot; Solid – Gain with low frequency gain and cutoff frequency marked; Dashed 
– Phase. Bottom: Group Delay with low frequency group delay and group delay at 1GHz frequency 
marked. 
 
3.2.3 Comparator 
 
Figure 18 – Asynchronous Comparator; basic schematic. 
Each transistor T0 through T3 is defined by a length and a width parameter. T0 and T1 comprise 
an inverting amplifier as do T2 and T3. 
 
As suggested in the CFD subsection 3.2.1, the asynchronous comparators were 
constructed using a differential amplifier with subsequent gain stages. Simple NOT 
gates are used for amplification of the signal output from the differential amplifier. 
The comparator has two inputs, one adding and one subtracting, and one output. If 
the sum of the adding input voltage and the negative of the subtracting input voltage 
is positive, then the output is equal to Vdd, the positive supply voltage. This is also 
called logic high or logic ‘1’. Otherwise, the output is Vss, also called logic low or logic 
‘0’. 
Table 4 – Comparator Design Parameters
Identifiers are referenced to Figure 18
Parameter                               Value
NMOS T0 & T2 (L; W)                     100nm; 240nm
PMOS T1 & T3 (L; W)                     100nm; 390nm
Total Additional Area                   0.126µm²
  
 
The output from the NOT gates, and indeed the comparator, is digital. It was 
assumed that the minimum length and width for the transistors in digital 
components would be near-optimal [27]. Some iterative optimization was done on 
the PMOS transistor widths to reduce rise time. Rise time was considered because 
logic stages after the comparator are triggered on rising edges. The mobility of holes 
in silicon is typically lower than that of electrons [27]. Because of this, a PMOS 
transistor of the same length requires a larger width than an NMOS transistor in 
order to have the same current flow at a given gate voltage and drain-source voltage. 
Note that while 90nm CMOS technology is used, 240nm is the minimum transistor 
width while allowing space for contacts to a metal layer. 
 
Figure 19 – Frequency response for comparator using optimized differential amplifier. 
Shown are the frequency responses for the differential amplifier only, as well as with one and 
both gain stages. 
 
As can be seen in Figure 19, each NOT gate inverting amplifier stage adds 16.7dB 
of gain. The final optimized comparator has a low-frequency gain of 58dB and a gain 
of 49dB at an operating frequency of 1GHz. If additional gain were required, an 
additional NOT gate could be cascaded with the others at a very low additional
footprint cost. Since this would invert the output, the positive and negative 
comparator inputs, Vinp and Vinn, would have to be swapped to maintain 
functionality of the CFD. 
 
Figure 20 – Comparator transient response using optimized differential amplifier 
Input offsets for the differential amplifier are 400mV; Vinp=10mVp-p 10GHz sinusoid. 
 
As seen in Figure 20, a 10mVp-p 10GHz sinusoidal differential input signal is 
converted to a digital signal which is functional for the purposes of the subsequent 
logic stages. This is a sufficient design considering that the input signal will only 
contain signals with a bandwidth less than 1 GHz. 
 
3.2.4 Delay 
  
Figure 21 – All-pass filter schematic 
Figure Left: Ideal schematic; Figure Right: Schematic as it appears in Cadence. Note that the ncap 
varactor requires a gate voltage. 
The analog delay used in the final schematic was a single-pole low-pass filter 
with a 3dB bandwidth larger than the input signal bandwidth, also called an all-pass 
filter. A key requirement for the analog delay was that the signal shape not be 
distorted; otherwise, CFD functionality is not guaranteed and significant range walk 
may be observed at the signal output. As a result, it was necessary to consider the 
group delay (Tg) of the filter. Filters with non-uniform group delay over the 
bandwidth of a signal cause the shape of the signal to be distorted. In addition, for 
filters with uniform gain and group delay over frequencies contained in a signal, the 
signal delay is approximately equal to the group delay [20]. 
Based on simulations involving Additive White Gaussian Noise (AWGN) run in 
Cadence and Matlab using idealized components, it was determined that the delay 
need not be very large to ensure CFD functionality. The delay used in the CFD was 
290 picoseconds. Larger delays of up to half the FWHM for Gaussian-esque pulses 
improve the function of the circuit at low SNR, but come at the cost of increased 
footprint due to the increased capacitor size in the all-pass filter. In the ideal 
noiseless case the analog delay within the CFD need only be infinitesimal; this is 
identical to taking a derivative assuming that the signal gain is corrected. 
An additional advantage of the CFD is that the delay time, and therefore the RC 
time constant of the single-pole low-pass filter, need not be precise to preserve 
functionality. Any change in the delay time for the analog delay component will 
result in half as much change for the CFD output signal delay. Since this is a constant 
delay applied across the output signal and since a small change in component 
behavior results in a smaller change in output delay, the delay need not be a 
precision value. However, as previously noted, the delay value will affect 
performance at low SNR. 
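The half-delay relationship described above can be checked numerically. The following Python sketch is illustrative only: it assumes an ideal Gaussian pulse and a pure time delay in place of the low-pass filter, and the function names (gaussian, crossing_time) and the pulse parameters are hypothetical choices rather than values taken from the thesis simulations. For a symmetric pulse, the delayed and undelayed signals cross at the peak time plus half the delay, independent of amplitude.

```python
import math

def gaussian(t_ns, t0_ns, fwhm_ns, amp):
    """Gaussian pulse with the given peak time, FWHM, and amplitude."""
    sigma = fwhm_ns / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amp * math.exp(-0.5 * ((t_ns - t0_ns) / sigma) ** 2)

def crossing_time(t0_ns, fwhm_ns, amp, delay_ns, iters=60):
    """Time at which the delayed pulse first exceeds the undelayed pulse."""
    def diff(t_ns):
        delayed = gaussian(t_ns - delay_ns, t0_ns, fwhm_ns, amp)
        return delayed - gaussian(t_ns, t0_ns, fwhm_ns, amp)

    # diff is negative at the peak and positive one delay later,
    # so a single crossing is bracketed and bisection applies.
    lo, hi = t0_ns, t0_ns + delay_ns
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if diff(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed example: 2.5 ns FWHM pulses peaking at 10 ns.
t_small = crossing_time(10.0, 2.5, 0.05, 0.290)         # 50 mV pulse
t_large = crossing_time(10.0, 2.5, 0.40, 0.290)         # 400 mV pulse
t_longer_delay = crossing_time(10.0, 2.5, 0.40, 0.390)  # delay +100 ps

print(t_small, t_large, t_longer_delay)
```

Under these idealized assumptions, both amplitudes cross near 10.145 ns, and lengthening the delay by 100 ps moves the crossing by about 50 ps.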
For the design in Figure 21, a p-type polysilicon resistor and an ncap varactor 
with constant gate voltage equal to Vss were used. Table 5 shows the design 
parameter values. 
Table 5 – Analog Delay Design Parameters 
Identifiers are referenced to Figure 21 
Parameter                                 Value 
R                                         2 kΩ 
R (L; W)                                  1.695µm; 340nm 
C                                         87fF 
C (L; W)                                  1µm; 10µm 
Total Area                                10.6µm² 
Low Frequency Gain (Av0)                  0dB 
3dB Bandwidth                             468MHz 
Low Frequency Group Delay (Tg0)           0.290 ns 
95% Group Delay Frequency (fTg_95%)       135MHz 
Group Delay at 3dB Cutoff Frequency       0.146 ns 
  
  
Summarizing Table 5, the single-pole all-pass filter requires slightly less area than the 
buffer used in this thesis assuming similar area for wiring and connections. However, 
roughly three times as much delay is achieved using the all-pass filter. The group 
delay is not as uniform as initially presumed to be required for the CFD [20]; the fTg_95% 
is 135MHz. However, the values shown in the table result in a functional CFD; it is 
observed that Tg remains larger than half of Tg0 at frequencies less than the 3dB 
cutoff. Group delay and Gain for the analog delay component are plotted in Figure 
22. 
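The observation that Tg stays above half of Tg0 below the 3dB cutoff matches the behavior of an ideal single-pole low-pass filter, whose group delay is Tg(f) = τ / (1 + (2πfτ)²). The Python sketch below is a simplified model, not the Cadence simulation: it assumes an ideal, unloaded single pole with τ set equal to the reported Tg0 of 0.290 ns, and the function names are hypothetical. The ideal model places the corner near 549 MHz and the 95% group delay frequency near 126 MHz, which differ from the simulated 468MHz and 135MHz in Table 5, as may be expected for a filter loaded by the other CFD components.

```python
import math

def corner_frequency_hz(tau_s):
    """3dB corner of an ideal single-pole low-pass filter."""
    return 1.0 / (2.0 * math.pi * tau_s)

def group_delay_s(f_hz, tau_s):
    """Group delay of an ideal single-pole low-pass filter."""
    w_tau = 2.0 * math.pi * f_hz * tau_s
    return tau_s / (1.0 + w_tau ** 2)

TAU_S = 0.290e-9  # assumption: tau equals the reported low-frequency group delay

f_c = corner_frequency_hz(TAU_S)
tg_at_corner = group_delay_s(f_c, TAU_S)    # exactly half of tau
f_95 = f_c * math.sqrt(1.0 / 0.95 - 1.0)    # frequency where Tg = 0.95 * tau

print(f"corner: {f_c / 1e6:.0f} MHz")
print(f"Tg at corner: {tg_at_corner * 1e9:.3f} ns")
print(f"95% group delay frequency: {f_95 / 1e6:.0f} MHz")
```

The half-value group delay at the corner (0.145 ns in this model) agrees closely with the 0.146 ns reported in Table 5.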
 
Figure 22 – Amplitude and Group Delay Frequency Responses for the All-Pass Filter 
Simulated in Cadence with all other CFD analog components connected (See Figure 23). Solid – 
Gain; Dashed – Group Delay 
 
 
 
 
 
 
3.2.5 Analog Optimization Process  
 
 
Figure 23 – Analog and analog-connected components in the CFD 
Note that Vbias and all transistor lengths and widths are design parameters in addition to 
parameters of the capacitor and resistor. 
 
For design simplicity Vbias, current mirror transistor widths (T3 & T4 in Figure 
13), amplifying transistor widths (T1 & T2 in Figure 13), and transistor lengths were 
constrained to be the same across all differential amplifiers. One benefit that results 
from this is that only one DC bias voltage needs to be generated and the same bias 
voltage can be distributed to all unit cells of a focal plane. A greater level of 
optimization could be achieved by allowing more variation in the design parameters, 
but doing so goes beyond the exploratory scope of this thesis. 
Transistor widths and lengths were changed iteratively and the effects of this on 
system bandwidth and performance were noted. The goal of this optimization 
process was to find a functional CFD with the smallest footprint. Corner analysis for 
the CFD was also performed in Cadence at this stage to ensure that functionality was 
maintained at CMOS 90nm process corners. 
 
Figure 24 – CFD Analog Components Transient Response Plot 
Top: Solid – CFD Input; Dashed – Analog Delay input and output; Dotted Horizontal – Threshold 
voltage (Vth = 440mV). Middle: Dashed – Differentiating amplifier output; Solid – Differentiating 
comparator output; Dotted Horizontal – 600mV. Bottom: Dashed – Arming amplifier output; Solid 
– Arming comparator output; Dotted Horizontal – 600mV. 
 
Figure 24 shows the input and output transient signals for each of the analog 
components shown in Figure 23 for a typical CFD input signal. Included in the figure 
are the differential amplifier outputs for both the differentiating and arming 
comparators before and after asynchronous conversion from analog into digital 
signals. Figure 24 was generated using component values for the final version of the 
CFD design as described in Table 2, Table 3, Table 4, and Table 5. The plots in Figure 
25 and Figure 26 are laid out in a similar fashion, but show the effects of variation on 
design parameters. 
 
Figure 25 – CFD Iterative Length Design 
Top: CFD input and threshold voltage. Middle: Differentiating amplifier output. Bottom: Arming 
amplifier output. Several transient outputs are plotted, each the result of a different length 
variation. 
 
Figure 25 shows the effects of varying the length of the differential amplifier 
transistors. Multiple transient responses are plotted on the same graph to highlight 
the effects of varying transistor lengths. The lengths were varied from 200nm to 
600nm in increments of 100nm. As can be seen, one effect of varying the length is 
variation in the output offset voltage. Additionally, it is observed that increasing the 
transistor lengths causes the slope of the outputs to increase at key points of interest. 
Such increases in slope due to increase in transistor length can be seen at time = 
10ns and time = 32ns in the arming amplifier output as well as at time = 36ns in the 
differentiating amplifier output in Figure 25. Achieving steep rising edge slopes in the 
outputs of these components is advantageous because logic components which 
accept input from the CFD trigger on rising edges; having a steeper slope will 
decrease uncertainty in timing measurements thus improving performance. 
As observed at time = 15ns in the differentiating amplifier output in Figure 25, the use of 
400nm to 600nm transistor lengths causes some distortion in the output signal as 
compared to the ideal differential output. This is caused by transistors in the 
amplifier leaving the saturation region at low amplifier output voltages. Significant 
distortion to this output signal can cause changes in the time at which the signal 
crosses 600mV; this results in timing uncertainty and can cause poor detector 
performance. Accordingly, the final length of 400nm was chosen. The distortion 
previously mentioned was considered small enough at this length so as not to 
significantly affect the final timing measurements; additionally, it is concluded from 
observations on Figure 25 that this length choice results in more complete use of the 
output voltage range for the amplifiers which results in steeper output signal rising 
edge slopes as compared to shorter transistor lengths. One improvement that could 
be made in future work is the mitigation of distortion resulting from the 400nm 
transistor length choice. This could be done by increasing the output offset voltage of 
the differentiating amplifier in an attempt to maintain saturation in transistors at low 
output voltages. 
A design process similar to the one described for choosing the transistor lengths 
was used in determining all other design parameters; transient functionality and 
system bandwidth were considered. Once the values for the parameters were chosen, 
the process was iterated in an attempt to reduce lengths and widths. Ideally, after a 
large number of iterations, such a design process should find the minimum 
functional footprint for the architecture considered. Finding the minimum functional 
footprint was outside the scope of this thesis; however, a low-footprint design was 
achieved using the described process. Only the process in determining transistor 
lengths is described in detail in the interest of brevity. 
Figure 26 shows the transient response of the analog components of the CFD; 
multiple responses are shown on the same graph to highlight the effects of process 
variation. As seen, the system maintains functionality at all process corners. The 
rising edge of the arming comparator output is exceptionally invariant with 
deviations in the process; however, the rising edge of the differentiating comparator 
output is of primary interest concerning timing uncertainty. The most significant 
change in timing of a digital output as compared to the TT process corner is seen in 
the output of the differentiating comparator in the SS process corner case. Observed 
deviation is 0.5ns which would equate to one-quarter foot in range or about 8cm in a 
lidar system using the proposed design. The deviation observed in the timing 
measurements will primarily affect timing (and therefore range) uncertainty from 
pixel to pixel and chip to chip. Timing uncertainty within a pixel that arises in the CFD 
due to process variation is negligible for considerations in this thesis. As required, the 
CFD analog components maintain functionality under process variations. 
 
Table 6 – Summary of Figure 26 Results for the Differentiating Comparator Output 
Values are deviations from the TT process corner measurement 
Process Corner       Rising Edge 1    Rising Edge 2 
SS Process Corner    -215ps           -130ps 
FF Process Corner    +450ps           +522ps 
   
 
Figure 26 – Process Corner Analysis 
Top: CFD input and threshold voltage. Middle: Differentiating amplifier and comparator outputs. 
Bottom: Arming amplifier and comparator outputs. Shown are FF (Dashed), TT (Solid), and SS 
(Dot-Dashed) process corners; note that the SS process corner results in a differentiating 
comparator output offset voltage close to 800mV. 
 
 
Figure 27 – System Bandwidth 
Amplitude Frequency Response in dB of: Top: Analog Delay and Input Buffer; Middle: 
Differentiating Comparator Amplifier; Bottom: Arming Comparator Amplifier. Note that 
frequency responses are for components connected to other analog components within the CFD 
as shown in Figure 23. 
 
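Table 7 – Summary of Figure 27 Results 
Component                    Parameter                Value 
Buffer                       Av0                      -0.82dB 
Buffer                       f3dB                     423MHz 
Analog Delay                 Av0                      -0.82dB 
Analog Delay                 f3dB                     295MHz 
Differentiating Amplifier    Center Frequency (fc)    363MHz 
Differentiating Amplifier    Gain @ fc                16.1dB 
Differentiating Amplifier    Lower f3dB               155MHz 
Differentiating Amplifier    Upper f3dB               691MHz 
Differentiating Amplifier    Av0                      0.678dB 
Differentiating Amplifier    f3dB for Av0             1.95GHz 
Arming Amplifier             Av0                      18.7dB 
Arming Amplifier             f3dB                     362MHz 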
   
 
Figure 27 shows the frequency response at the outputs of the analog 
components in the CFD. The analysis was conducted on the compiled CFD schematic 
(see Figure 23). The key consideration with the amplitude frequency response of the 
analog components is to verify functionality for Gaussian-esque pulses of 2.5ns 
FWHM. Note that the differentiating amplifier has a frequency response similar to 
that of a band-pass filter, as expected [26]. The Lower and Upper 3dB cutoff 
frequencies are roughly half and twice the center frequency, respectively. This 
implies that some system functionality will be lost for Gaussian-esque pulses with 
roughly half or twice the FWHM of the designed 2.5ns. Simulation analysis in 
Cadence indicates that, for an input signal containing one 50mV pulse and one 
400mV pulse which are well-separated, the detection circuit maintains functionality 
with Gaussian-esque pulses having FWHM between 1.5ns and 5ns; small amplitude 
pulses are not detected outside this range and large deviations from this range result 
in a breakdown of system functionality. It is noted from transient simulations 
performed in Cadence that detection functionality is maintained for 100mV pulses 
with FWHM between 0.5ns and 10ns. It is concluded from this that the range of 
FWHMs over which system functionality is maintained increases with increasing 
threshold voltage; alternatively, this is equivalent to saying that there is a tradeoff 
between FWHM operating range and the minimum amplitude of a detectable 
Gaussian-esque pulse. 
Allowing more iterative steps in the design process as well as allowing for more 
variation on design parameters would optimize the design further; but again, doing 
so was outside of the scope of this thesis. The goal of this thesis was to explore the 
viability of an architecture. 
3.2.6 CFD Logic and CFD Footprint 
 
Figure 28 – CFD Logic Schematic 
 
The CFD logic takes the digital outputs from the differentiating and arming 
comparators and combines them such that a rising edge is output from the CFD 
whenever there is a rising edge output from the differentiating comparator and the 
arming comparator is logic high. 
In a trial design, a simple AND gate was used as the CFD logic. Shown in Figure 
29 is an example of an input case where the AND-type CFD schematic does not meet 
the design goal. For CFDs implemented with an AND gate, it is possible to 
consistently generate a systematic false alarm. The false alarm occurs when the 
falling edge of the differentiating comparator output occurs after the arming comparator 
output rising edge from a subsequent pulse. This results in a false rising edge 
immediately after the falling edge for a correct detection on the CFD output; this is 
termed a Falling Edge False Alarm (FEFA) in this thesis. A further complication is 
that FEFA occurs only for a range of pulse separations, and this range depends on 
the FWHM of the Gaussian input pulses as well as the CFD threshold voltage. 
 
Figure 29 – AND-type CFD transient response for an input series of closely separated Gaussian 
pulses with varying separation. 
Separation decreases by 50ps for each subsequent pair of pulses shown. FWHM is 1.25ns for 
input pulses. Top: CFD Input and Threshold Voltage. Middle: Solid – Differential Comparator 
Output; Dashed – Arming Comparator Output. Bottom: CFD Output. Note that the false alarms are 
circled only to guide the eye; this does not affect the simulation. Also note that some delay is 
present between input and output signals. 
 
To solve the systematic false alarm problem, the AND gate was replaced with a 
DFF that is reset to logic LOW whenever either the Differentiating Comparator Output 
(DO) or the Arming Comparator Output (AO) is LOW. The DFF samples the AO on 
DO rising edges. This solution alters the CFD output logic function such that the 
output never rises on an AO rising edge. This eliminates FEFA. 
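The difference between the two logic schemes can be illustrated with a short discrete-time model. The Python sketch below is a behavioral approximation only: the sample-based waveforms, the function names (and_cfd, dff_cfd, rising_edges), and the specific DO and AO sequences are hypothetical and were chosen to reproduce the FEFA condition described above, in which AO rises for a second pulse while DO is still high from the first.

```python
def and_cfd(do, ao):
    """AND-type CFD logic: output is high whenever both inputs are high."""
    return [d & a for d, a in zip(do, ao)]

def dff_cfd(do, ao):
    """DFF-type CFD logic: reset when DO or AO is low, sample AO on DO rising edges."""
    out, q, prev_do = [], 0, 0
    for d, a in zip(do, ao):
        if d == 0 or a == 0:
            q = 0                      # reset dominates
        elif prev_do == 0 and d == 1:
            q = a                      # DO rising edge: sample AO
        out.append(q)
        prev_do = d
    return out

def rising_edges(sig):
    """Sample indices at which the signal goes from 0 to 1."""
    return [i for i in range(1, len(sig)) if sig[i - 1] == 0 and sig[i] == 1]

# Assumed waveforms for two closely spaced pulses (one entry per time step).
# AO drops briefly between pulses at step 4 and rises again at step 5,
# before DO has fallen from the first detection.
AO = [0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0]
DO = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0]

and_edges = rising_edges(and_cfd(DO, AO))
dff_edges = rising_edges(dff_cfd(DO, AO))
print("AND-type detections:", and_edges)
print("DFF-type detections:", dff_edges)
```

In this model the AND-type logic reports an extra rising edge at step 5, caused by the AO rising edge, while the DFF-type logic reports only the two DO rising edges at steps 2 and 8.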
 
 
Table 8 – AND Gate Logic 
DO   TO   OUT 
0    0    0 
0    1    0 
1    0    0 
1    1    1 
 
Table 9 – Expanded AND Gate Logic 
DO   TO   OUT 
↑    1    ↑ 
1    ↑    ↑ 
↓    1    ↓ 
1    ↓    ↓ 
X    0    0 
0    X    0 
 
Table 10 – CFD Logic 
DO   TO   OUT 
↑    1    ↑ 
1    ↑    – 
↓    1    ↓ 
1    ↓    ↓ 
X    0    0 
0    X    0 
 
Table 8, Table 9, and Table 10: “↑” = Rising Edge; “↓” = Falling Edge; “–” = Output stays the same 
 
The change in the logic is clarified in Table 8, Table 9, and Table 10. The Boolean 
logic for the AND gate shown in Table 8 is expanded to show the output rising/falling 
edge response in Table 9. The logic for the final proposed design is shown in Table 10. 
It is possible to calculate a minimum footprint estimate for the CFD. As 
mentioned elsewhere, the minimum estimate excludes footprint contributions from 
metal-layer connections between transistors; only the area contribution of 
transistors and other basic components is accounted for in the minimum footprint 
estimates. Table 11 summarizes the footprint estimates listed in Table 2, Table 3, 
Table 4, and Table 5 as well as footprint estimates listed in Appendix A. 
 
Table 11 – Minimum Footprint Estimate for CFD 
Component            Footprint (µm²) 
Buffer               12.4 
Analog Delay         10.6 
Comparator (2x)      29.9 
DFFwRST0             0.672 
NAND Gate            0.096 
Total CFD Area       53.6 
Longest Dimension    22 µm 
  
 
Since the total CFD area is much less than the area of a square with side lengths 
equal to the longest transistor dimension, a layout for the CFD could easily fit within 
a square of dimensions slightly larger than 22µm X 22µm. 
3.3 Timing Logic 
The timing logic receives input from the CFD. This subsystem generates control 
signals for the Track and Hold Analog Timers. Background theory concerning shift 
registers is integral in understanding the timing logic and is covered in Section 2.3. 
The theory governing the requirements of the timing logic is covered in Section 3.1, 
the design chapter introduction. 
The gate-level schematic of the timing logic for the proposed detector design is 
shown in Figure 30. 
 
 
Figure 30 – Timing Logic Schematic. 
 
An idealized version of the schematic is shown in Figure 31 to aid understanding 
of the timing logic. 
 
Figure 31 – Idealized Timing Logic 
The ‘Logic’ block shown ensures that the circuit behavior is correct after a reset. Note that there 
are three memory element registers: B1, B2, and B3. These may be set to one of two states. ‘0’ 
indicates that the corresponding timing mechanism transmission gate is set to tracking mode. ‘1’ 
indicates that the corresponding timing mechanism transmission gate is set to holding mode. 
 
As discussed elsewhere, at least one analog timing branch must always be in 
tracking mode during the timing interval so that a new timestamp may be created. 
Consider the Idealized Timing Logic in Figure 31. When the circuit is reset at the start 
of a timing interval, all registers are set to ‘0’ which sets all analog timer branches to 
tracking mode. The ‘Logic’ block shown within the idealized timing logic initially only 
shifts a constant ‘1’ value into the first memory element of the shift register. When 
there is only one ‘0’ stored among the shift register memory elements, the ‘Logic’ 
block changes from a mode of only shifting a constant ‘1’ into the first memory 
element to a mode of shifting the content of the last memory element into the first 
memory element. In this way the Idealized Timing Logic changes into a circular shift 
register. Note that the shift register now contains exactly one ‘0’. There will be exactly 
one ‘0’ stored among memory elements for all future shift events until the circuit is 
reset. In this way, at least one timing branch is always in tracking mode and the 
requirements of the timing logic are fulfilled. 
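The behavior of the Idealized Timing Logic can be traced with a small state model. The Python sketch below is a behavioral illustration under simplifying assumptions: registers are modeled as a list of bits (B1 first), the mode change is modeled as a flag that stays set until reset, and the function names (shift, run) are hypothetical rather than taken from the schematic.

```python
def shift(state, circular):
    """One shift event: feed a constant 1, or the last element in circular mode."""
    feed = state[-1] if circular else 1
    return [feed] + state[:-1]

def run(n_events, n_registers=3):
    """Register contents after reset and after each of n_events shift events."""
    state = [0] * n_registers          # reset: every branch is tracking
    circular = False                   # forward shift mode after reset
    history = [tuple(state)]
    for _ in range(n_events):
        state = shift(state, circular)
        if state.count(0) == 1:
            circular = True            # stays set until the next reset
        history.append(tuple(state))
    return history

history = run(5)
for step, bits in enumerate(history):
    print(step, bits)
```

For three registers this yields (0,0,0), (1,0,0), (1,1,0), (0,1,1), (1,0,1), (1,1,0): once circular mode is entered, exactly one ‘0’ circulates, so one branch is always tracking.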
Consider Figure 32, which shows the schematic in Figure 30 with one of the 
circuit paths highlighted. When the output from the RS Latch is logic ‘1’, the NAND 
gate acts as a simple inverter for its other input. In this case, output from the 
right-most DFF, ‘Q̅’, is inverted to ‘Q’ and input to the left-most DFF. In such a case, 
the timing logic is transformed into an equivalent of a circular shift register where 
data is shifted along the path highlighted in the figure. Otherwise, when the output 
from the RS Latch is logic ‘0’, the output from the NAND gate is a constant ‘1’. From 
this, it is concluded that the circuit in Figure 30 behaves identically to the idealized 
timing logic shown in Figure 31. Also concluded is that the RS Latch and NOR gate in 
Figure 30 are analogous to the ‘Logic’ block in Figure 31. Note that when mentioning 
the outputs of the shift register memory cells, the author refers to the logic outputs 
at C1’, C2’, and C3’ as identified in Figure 30. 
 
 
Figure 32 – Timing Logic Schematic in Feedback Mode. 
 
The RS Latch and NOR gate serve the purpose of altering the mode of the timing 
logic between a mode where a ‘1’ is shifted into the chain of shift registers, referred 
to as forward shift mode in this thesis, and a mode where a circular shift register is 
implemented, referred to as circular shift mode. The NOR gate takes the state of the 
last two shift register memory cells as input with one inverted as shown in Figure 30. 
As long as both of the last two shift register memory cells are ‘0’, the NOR gate also 
outputs ‘0’. Two shift events after a reset, the center DFF outputs a ‘1’ which causes 
the NOR gate to output a ‘1’ and set the RS Latch which sets the mode of the timing 
logic to circular shift. Upon circuit reset, the outputs of all shift register memory cells 
are set to ‘0’ in addition to the RS Latch being reset. Thus a reset causes the timing 
logic to go back into forward shift mode. 
Note that the proposed timing logic is expandable. An additional memory cell 
can be easily added to the logic by including an additional DFF between the first and 
second existing DFFs; additionally, another analog timing branch would be added to 
accommodate this change. 
For the purposes of this thesis, it is also important to note the footprint of the 
timing logic summarized in Table 12. As in other footprint estimates, this is a 
minimum estimate because the contribution of connections between transistors is 
not taken into account. 
 
Table 12 – Minimum Footprint Estimate for Timing Logic 
Component                       Footprint (µm²) 
DFFwRST0 (1x)                   0.672 
DFFwRST0 (4x)                   2.69 
NOR Gate                        0.144 
NAND Gate                       0.096 
Total Timing Logic Area         3.79 
Longest Transistor Dimension    0.48 µm 
  
 
Since the hardware is easily expandable, it is useful to note that the marginal 
footprint cost for one memory element in the circular shift register is slightly more 
than 0.672 µm². 
3.4 Track and Hold Analog Timers 
The track and hold analog timing system receives control signals from the timing 
logic and generates timestamps. Changes in the control signals are directly timed by 
causing a sampling of an input timing ramp. The sampled analog voltage represents a 
time within the timing interval. Changes in the control signals are triggered by the 
input pulses after a near-uniform small delay; as a result, the output voltages 
represent timing measurements for the input pulses.  
A schematic of the timing mechanism is shown in Figure 33. As shown, there are 
three identical branches of the timing mechanism. As discussed elsewhere, this 
allows two timestamps to be output at the end of a timing interval; the third 
non-timestamp output will be the ending voltage of the input timing ramp. The 
non-timestamp output can be made to be an arbitrary voltage by changing the final 
voltage of the timing ramp before readout and reset because the non-timestamp 
track and hold branch is in tracking mode. 
 
Figure 33 – 3-Branch Analog Timing Mechanism 
 
Theory for individual branches of the analog timing mechanism has been 
covered in Section 2.1, the section on the theory of analog timing. The behavior of 
the analog timing mechanism has also been covered briefly in Section 3.1, the 
introduction to the design chapter. 
As seen in Figure 33, there are three timestamp voltage outputs (Labeled Vt1, 
Vt2, Vt3; referred to as Vt generally) which result from one of three duplicate 
branches of the analog timing mechanism. Each branch can be in one of two modes: 
Track or Hold. When a branch is in Track mode the output, Vt, follows the voltage of 
the input ramp signal and the two voltages are approximately equal; that is to say, 
the input ramp signal propagates through the closed transmission gate to the output. 
When a branch is in Hold mode the output voltage, Vt, stays the same with negligible 
decay. A further property of each branch of the analog timing mechanism is that 
there is some rise/fall time when switching from Hold to Track. Three analog timing 
branches are used to measure the last two pulses within a signal. In this way, it is 
possible to hold two output timestamps while also always having one timing branch 
in track mode. If a new pulse is detected within the signal after two pulses have 
already been detected, the timing branch in Track mode is switched to Hold and the 
branch holding the oldest timestamp is switched from Hold to Track mode. After the 
Hold to Track rise time, a new timestamp may be created. This rise/fall time 
determines the minimum resolvable separation between pulses within the input 
signal.  
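The rotation of Track and Hold modes across the three branches can be sketched as follows. This Python model is idealized and makes several assumptions not found in the design itself: the buffers are treated as unity gain with no offset, the Hold to Track rise time is ignored, and the function names (ramp_mv, last_two_timestamps) are hypothetical. The ramp limits of 300mV and 800mV over 200ns follow the values used in this section.

```python
def ramp_mv(t_ns, start_mv=300.0, stop_mv=800.0, interval_ns=200.0):
    """Ideal linear timing ramp voltage at time t_ns."""
    return start_mv + (stop_mv - start_mv) * t_ns / interval_ns

def last_two_timestamps(detect_times_ns, n_branches=3, end_ns=200.0):
    """Branch output voltages at the end of the timing interval."""
    hold = [False] * n_branches        # after reset every branch tracks
    held = [None] * n_branches
    nxt = 0                            # branch that will be switched to Hold next
    for t_ns in detect_times_ns:
        hold[nxt] = True               # Track -> Hold: sample the ramp
        held[nxt] = ramp_mv(t_ns)
        nxt = (nxt + 1) % n_branches
        if all(hold):
            hold[nxt] = False          # oldest timestamp returns to Track
            held[nxt] = None
    # A tracking branch ends at the final ramp voltage (non-timestamp output).
    return [held[i] if hold[i] else ramp_mv(end_ns) for i in range(n_branches)]

outputs = last_two_timestamps([20.0, 60.0, 100.0, 140.0])
print(outputs)
```

With detections at 20ns, 60ns, 100ns, and 140ns, only the samples taken at 100ns and 140ns remain held at the end of the interval; the remaining branch reads the final ramp voltage.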
An output buffer is present at each output to supply current to the outputs as 
well as to provide a holding capacitance for the transmission gates. The input buffer 
for the timing mechanism is necessary because, as previously stated, a duplicate of 
the proposed detector design is intended to be implemented in each unit cell within 
a focal plane. The input ramp signal in such a system would be supplied to every unit 
cell; thus, the input buffer prevents corruption of the input ramp signal due to 
excessive current draw resulting from a large number of unit cells. 
For design simplicity, the same design was used for all analog buffers in the Last 
Two Surface Range Detector design. As shown in Table 2 in Subsection 3.2.1, the 
input capacitance for the buffer is 24fF. Figure 34 shows the effects of the holding 
capacitor value on the output voltage. As seen in the figure, the effects of charge 
injection significantly alter the value of the output voltage as compared to the input 
voltage when a holding capacitor with low capacitance is used or, in the no-load case, 
when there is no holding capacitance. It is concluded that 24fF is sufficient for the 
design requirements. 
 
Figure 34 – Effects of Transmission Gate Capacitive Load on Charge Injection. 
The dotted line shows the input signal; several outputs are plotted. 
 
Using the buffer design described in Subsections 3.2.1 and 3.2.2, the output 
voltage at Vt ranged from 352mV to 764mV in Cadence simulations for an input ramp 
starting at 300mV and rising to a final voltage of 800mV over a timing interval of 
200ns; this is a total voltage swing of 412mV. A larger voltage swing on the input 
ramp would result in a larger output voltage swing, but would result in significant 
non-linearity of the output voltages due to buffer transistors going out of saturation 
at the extremes of the voltage range. The speed of light is approximately 1 foot per 
nanosecond, and a light pulse from a lidar system travels to a target and then back to 
the detector. Thus, a timing interval of 200ns results in a range interval for the lidar 
system of about 100 ft. (30m). From this, it is concluded that a difference of roughly 
4mV between two timestamps indicates that the two corresponding objects within the 
target scene observed by the lidar are separated by 1 foot of range. A summary of 
numbers involving the input ramp is shown in Table 13. 
 
Table 13 – Summary of Input Ramp and Output Voltage Swing 
Parameter                                                 Value 
Ramp Start Voltage                                        300mV 
Ramp Stop Voltage                                         800mV 
Timing Interval                                           200ns 
Minimum Output Voltage                                    352mV 
Maximum Output Voltage                                    764mV 
Output Voltage Swing                                      412mV 
Range Interval                                            98.3 ft. (30.0m) 
Change in Input Peak Time per Change in Output Voltage    0.485 ns/mV 
Change in Range per Change in Output Voltage              0.239 ft. / mV (7.28 cm/mV) 
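The conversion from output voltage to time and range follows directly from the timing interval, the output voltage swing, and the round-trip travel of the light pulse. The Python sketch below reproduces that arithmetic; it assumes a perfectly linear relationship between output voltage and time, and the function and constant names are hypothetical.

```python
C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond

def ns_per_mv(interval_ns=200.0, v_min_mv=352.0, v_max_mv=764.0):
    """Time represented by one millivolt of output voltage swing."""
    return interval_ns / (v_max_mv - v_min_mv)

def range_m(delta_t_ns):
    """One-way range corresponding to a round-trip time difference."""
    return C_M_PER_NS * delta_t_ns / 2.0

slope_ns_per_mv = ns_per_mv()
cm_per_mv = 100.0 * range_m(slope_ns_per_mv)
range_interval_m = range_m(200.0)

print(f"{slope_ns_per_mv:.3f} ns/mV")
print(f"{cm_per_mv:.2f} cm/mV")
print(f"{range_interval_m:.1f} m range interval")
```

Under this linear assumption a 2mV voltage difference corresponds to roughly 15cm of range, and a 0.5ns timing deviation to about 7.5cm.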
 
 
A minimum allowable rise time was determined by simulating the rise and fall 
time for the transmission gate design in Cadence. An additional consideration of the 
minimum rise time is the allowable timing uncertainty. If the holding capacitance at 
the buffer input is not allowed to fully charge to the input ramp voltage and a 
timestamp is taken, then the timestamp voltage will not be equal to the ideal voltage 
it would have been had the holding capacitor been completely charged. In other 
words, the input and output voltages will not be the same; mismatch between the 
input timing voltage and the output timing voltage results in uncertainty. In the ideal 
noiseless case, a capacitor in an RC circuit requires an infinite amount of time to 
charge completely; as a result, there will always be some timing uncertainty. The 
uncertainty due to the rise time of the RC circuit comprised of the transmission gate 
and the buffer input can be made arbitrarily smaller by increasing the minimum rise 
time and, therefore, the minimum allowable separation between two input pulses to 
the detector. Using Cadence simulations, it was determined that the difference 
between input and output voltages for an analog timing branch would not be more 
than 2mV if the branch was allowed a rise time of 2.8ns; this equates to roughly 
15cm of range according to the values in Table 13. If input pulses are separated more 
closely, more timing uncertainty will result. The fall time for the analog timing 
branches is also significant; it determines the amount of time which should be 
allowed between a reset and the start of a new timing interval. The rise and fall 
times for the timing branches are summarized in Table 14. 
 
Table 14 – Rise and Fall Times for Input Voltage Transitions between 300mV and 800mV for Analog Timing Branches 
Rise Time    2.82ns 
Fall Time    2.21ns 
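The tradeoff between allowed rise time and residual voltage error can be illustrated with an ideal first-order RC model. The Python sketch below is a rough approximation: the transmission gate is not a linear resistor, and the time constant used here is a hypothetical value inferred by assuming that a 500mV step settles to within 2mV in 2.8ns. The function names are likewise assumptions.

```python
import math

def residual_error_mv(step_mv, t_ns, tau_ns):
    """Remaining voltage error after t_ns for an ideal first-order RC step response."""
    return step_mv * math.exp(-t_ns / tau_ns)

def settle_time_ns(step_mv, err_mv, tau_ns):
    """Time needed for the error to decay from step_mv to err_mv."""
    return tau_ns * math.log(step_mv / err_mv)

# Hypothetical time constant implied by 500 mV -> 2 mV in 2.8 ns.
TAU_NS = 2.8 / math.log(500.0 / 2.0)

err_full = residual_error_mv(500.0, 2.8, TAU_NS)   # about 2 mV by construction
err_half = residual_error_mv(500.0, 1.4, TAU_NS)   # error if only half the time is allowed

print(f"tau = {TAU_NS:.3f} ns")
print(f"error after 2.8 ns: {err_full:.2f} mV")
print(f"error after 1.4 ns: {err_half:.2f} mV")
```

In this model, halving the allowed rise time raises the residual error from 2mV to roughly 32mV, which illustrates why more closely separated pulses lead to greater timing uncertainty.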
  
 
An alternate method of taking timestamps involves replacing the analog 
timing mechanism with an equivalent digital timing mechanism. The alternate 
method would involve replacing the transmission gates with D flip-flops. This would 
require a digital equivalent of the timing ramp to be input to the timing mechanism 
with digital words representing linearly increasing time; the output would be a digital 
timestamp. The digital equivalent method would require an output wire for each 
digital bit of each timestamp output on each pixel. This method was avoided in the 
design in this thesis due to the anticipated interference on the analog input signal 
that would be caused by the large number of digital inputs and outputs required for 
a digital timing mechanism; for this reason, the analog method was used. 
It is important to note the minimum on-chip footprint estimate for the analog 
timing mechanism. As noted elsewhere, the minimum footprint estimate does not 
account for on-chip footprint contribution from connecting wires between transistors 
and other major components. A summary of the footprint estimate for the analog 
timing mechanism is shown in Table 15. 
 
Table 15 – Minimum Footprint Estimate for the Analog Timing Mechanism 
Component                       Footprint (µm²) 
Buffer (1x)                     12.4 
Transmission Gate (1x)          0.048 
Analog Timing Branch (1x)       12.4 
Analog Timing Branch (3x)       37.3 
Total Timing Mechanism Area     49.7 
Longest Transistor Dimension    22 µm 
  
 
Since the hardware is easily expandable, it is useful to note that the marginal 
footprint cost for one analog timing branch is slightly more than 12.4 µm². Note that 
it is not necessarily the case that an arbitrarily large number of timing branches may 
be added since the buffer between the input ramp and the holding capacitances 
cannot supply sufficient current to an arbitrarily large number of holding capacitors. 
3.5 Conclusion 
Figure 35 shows a sample transient response of the Last Two Surface Range 
Detector. As seen in the figure, the detector functions as expected. The detector 
correctly holds timestamps for the first and second pulse events, and then overwrites 
the first timestamp when a third pulse is detected. In the figure from 20ns to 40ns, 
the detector holds the timestamps for the two most recently detected pulses. Then 
for each of the last two pulses, the detector correctly overwrites the oldest 
timestamp and creates a new timestamp for the most recently detected event. At the 
end of the timing interval, the voltages Vt1 and Vt2 are timestamps while Vt3 is 
equal to the voltage of the input ramp signal with some small gain applied. As 
mentioned elsewhere it is important to note that the voltage of the input ramp signal 
at the end of the timing interval, and thus the voltage of the non-timestamp output, 
can be chosen arbitrarily. It is noted that if no detections occur, then all of the output 
voltages will be equal to the non-timestamp voltage. Note that non-timestamp 
voltage and non-detection voltage are used interchangeably in this document. 
As seen in Figure 35, the delay between a pulse peak event and the generation 
of a new timestamp is roughly 2ns.  
Note that Figure 35 shows the noiseless case.  
 
Figure 35 – Sample Transient Response for the Last Two Surface Range Detector 
Top: Detector input and threshold voltage. Bottom: Detector outputs and input ramp signal; also 
shown is the reset pulse that demarcates the start and stop of the timing interval. The FWHM for 
pulses in the detector input is 2.5ns. 
 
Before the start and after the end of the timing interval a 200ps reset pulse is 
applied to the circuit. It is observed in Figure 35 that after the reset pulse is applied 
just before the 1ns and 62ns marks, the detector resets correctly and another timing 
interval begins. As stated in Section 3.4, the output voltages require a fall time of 
2.21ns after a reset before a new timing interval can begin. 
Finally, the minimum footprint estimate for the proposed detector can be 
calculated from the values summarized in Table 11, Table 12, and Table 15. The 
footprint estimates are aggregated in Table 16. Note once again that the minimum 
footprint estimate does not account for metal-layer connections between transistors 
and other major components. However, it can be expected that the additional space 
required for metal connections will be much less than the total footprint; also, it can 
be expected that the largest dimension will not be significantly affected. 
 
Table 16 – Minimum Footprint Estimate for the Proposed Detector
Component                   Footprint (µm²)
CFD                         53.6
Timing Logic                3.79
Analog Timing Mechanism     49.7

Total Detector Footprint    107.1 µm²
Longest Dimension           22 µm
  
 
Since the hardware is easily expandable to include an additional timestamp, it is 
useful to note that the marginal footprint cost for such an expansion is slightly more 
than 13.1 µm² assuming that additional metal connections would be required. 
The longest transistor dimension in the proposed detector is 22 µm. Since the 
area of a square with this side length is 484 µm², it is concluded that the detector 
could very easily fit within a space slightly larger than 22 µm × 22 µm with significant
excess space for other circuits if desired. 
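The footprint arithmetic above can be reproduced in a few lines. The sketch below uses only the values listed in Table 16; the variable names are illustrative, and, like the estimate itself, the calculation ignores metal-layer connections.

```python
# Component footprints from Table 16, in square micrometers.
components_um2 = {
    "CFD": 53.6,
    "Timing Logic": 3.79,
    "Analog Timing Mechanism": 49.7,
}

total_um2 = sum(components_um2.values())   # minimum detector footprint
longest_um = 22.0                          # longest transistor dimension
square_um2 = longest_um ** 2               # area of a 22 um x 22 um cell
excess_um2 = square_um2 - total_um2        # area left for other circuits

print(f"total footprint : {total_um2:.1f} um^2")
print(f"22 um square    : {square_um2:.0f} um^2")
print(f"excess area     : {excess_um2:.1f} um^2")
```

The total rounds to 107.1 µm², so the transistors occupy less than a quarter of a 22 µm × 22 µm cell.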
  
4 PERFORMANCE ANALYSIS 
4.1 Introduction 
The sections in the Performance Analysis Chapter primarily describe the method 
and results of simulations; theory on the types of analyses covered in this chapter is 
covered in Section 2.4, Theory of Analysis. 
4.2 Timing Interval Uniformity 
Theory on Timing Interval Uniformity is covered in Subsection 2.4.1. 
In the analysis considered in this section, the output timestamp voltage was 
observed as a function of the input pulse time. Several simulations were run in which 
the input signal contained a single pulse with a known pulse peak time; the output 
timestamp voltage was recorded for each simulation. This analysis also serves as 
single-pulse functional verification. Figure 36 is provided to solidify the simulation 
concept for the reader. 
 
Figure 36 – Input and Output for Several Single-Pulse Simulations 
The results of ten different simulations are shown; each is plotted in a different hue/shade
with the hue/shade of the input matching the hue/shade of the corresponding output. 
 
Figure 37, Figure 38, Figure 39, and Figure 40 show the results of the simulations 
run. Table 17 shows a summary of results. Any residual between the actual output 
values and the expected output values is interpreted as uncertainty in the timing 
measurements. Comparisons in range uncertainty assume the values listed in Table 
13, namely the conversion between timestamp voltage and range from the detector 
to the detected surface: 7.28 cm/mV. 
 
Figure 37 – Timing Uniformity TT Process Corner with Input Ramp Residual 
The waveform labeled ‘Data’ is the output of the detector for the simulations run; each data point 
is marked with a dot; each is the output timestamp voltage for a single pulse with a peak at the 
time indicated. Input pulse times ranged from 1ns to 198ns; points on the plot are spaced 1ns 
apart on the X-axis; 198 simulations were run. 
 
Figure 37 shows the relationship between input pulse peak time and output 
timestamp voltage in the TT process corner case as compared to the input timing 
ramp. Observing the figure, it is seen that using the timing ramp as the expected 
output results in a voltage uncertainty of 20-40mV when using the full timing interval. 
Using the relationship previously stated in Table 13, this results in 1.46-2.91m
(4.79-9.55ft.) of range uncertainty. This is unacceptable for the application
considered in this thesis. It is seen that the residual in Figure 37 is nearly linear with
respect to input peak time; this is because of the linear, non-unity amplification of
the timing mechanism buffers.
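The reasoning that a non-unity buffer gain produces a residual that is linear in time can be illustrated with a synthetic example. The ramp offset, ramp slope, and buffer gain below are illustrative values chosen for the sketch, not values taken from the thesis; the fit is an ordinary least-squares line written out by hand so that the example has no dependencies.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, offset)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Illustrative (hypothetical) ramp and buffer gain.
RAMP_OFFSET_MV = 400.0
RAMP_SLOPE_MV_PER_NS = 2.0
BUFFER_GAIN = 0.95

times_ns = [float(t) for t in range(1, 199)]          # 1ns to 198ns, 1ns steps
ramp_mv = [RAMP_OFFSET_MV + RAMP_SLOPE_MV_PER_NS * t for t in times_ns]
output_mv = [BUFFER_GAIN * v for v in ramp_mv]        # ideal linear buffer

# Residual against the input ramp grows linearly with time.
ramp_residual = [o - r for o, r in zip(output_mv, ramp_mv)]

# Residual against the best linear fit of the output vanishes.
slope, offset = linear_fit(times_ns, output_mv)
fit_residual = [o - (slope * t + offset) for o, t in zip(output_mv, times_ns)]

print(f"ramp residual: {ramp_residual[0]:.1f} mV to {ramp_residual[-1]:.1f} mV")
print(f"fit: {slope:.3f} mV/ns, {offset:.1f} mV")
print(f"max |fit residual|: {max(abs(r) for r in fit_residual):.2e} mV")
```

For a perfectly linear buffer the fit residual is zero to numerical precision; any residual that remains after fitting real data therefore reflects non-linearity rather than gain error.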
 
Figure 38 – Timing Uniformity TT Process Corner with Linear Fit Residual 
Note that the waveform of the linear fit may not be easily visible in figure top because it fits the 
output very closely. The waveform labeled ‘data1’ is the output of the detector for the simulations 
run; each data point is marked with a dot; each is the output timestamp voltage for a single pulse 
with a peak at the time indicated. Input pulse times ranged from 1ns to 198ns; points on the plot 
are spaced 1ns apart on the X-axis; 198 simulations were run. 
 
Figure 38 shows the relationship between input pulse peak time and output 
timestamp voltage in the TT process corner case as compared to the best linear fit 
for the output; the fit is defined by the equation shown in the figure. Observing the 
figure, it is seen that using the linear fit as the expected output results in a voltage 
uncertainty of less than 3mV when using the full timing interval; if the timing interval 
is limited to the 5ns to 190ns range, a 2mV voltage uncertainty can be achieved.
Using the relationship previously stated in Table 13, 2-3mV of timestamp uncertainty
equates to 14.6-21.8cm (0.5-0.7 ft.) of range uncertainty.
 
 
Figure 39 – Timing Uniformity SS Process Corner with Linear Fit Residual  
Note that the waveform of the linear fit may not be easily visible in figure top because it fits the 
output very closely. The waveform labeled ‘data1’ is the output of the detector for the simulations 
run; each data point is marked with a dot; each is the output timestamp voltage for a single pulse 
with a peak at the time indicated. Input pulse times ranged from 1ns to 198ns; points on the plot 
are spaced 1ns apart on the X-axis; 198 simulations were run. 
 
Figure 39 shows the relationship between input pulse peak time and output 
timestamp voltage in the SS process corner case as compared to the best linear fit for 
the output; the fit is defined by the equation shown in the figure. Observing the 
figure, it is seen that using the linear fit as the expected output results in a voltage
uncertainty of less than 7mV when using the full timing interval; if the timing interval
is limited to the 10ns to 200ns range, a 2mV voltage uncertainty can be achieved.
Using the relationship previously stated in Table 13, 2mV of timestamp uncertainty
equates to 14.6cm (0.5 ft.) of range uncertainty, and 7mV of timestamp uncertainty
results in 51.0cm (1.7 ft.) of range uncertainty.
 
 
Figure 40 – Timing Uniformity FF Process Corner with Linear Fit Residual  
Note that the waveform of the linear fit may not be easily visible in figure top because it fits the 
output very closely. The waveform labeled ‘data1’ is the output of the detector for the simulations 
run; each data point is marked with a dot; each is the output timestamp voltage for a single pulse 
with a peak at the time indicated. Input pulse times ranged from 1ns to 198ns; points on the plot 
are spaced 1ns apart on the X-axis; 198 simulations were run. 
 
Figure 40 shows the relationship between input pulse peak time and output
timestamp voltage in the FF process corner case as compared to the best linear fit for
the output; the fit is defined by the equation shown in the figure. Observing the
figure, it is seen that using the linear fit as the expected output results in a voltage
uncertainty of less than 6mV when using the full timing interval; if the timing interval
is limited to the 0ns to 180ns range, a 2mV voltage uncertainty can be achieved.
Using the relationship previously stated in Table 13, 2mV of timestamp uncertainty
equates to 14.6cm (0.5 ft.) of range uncertainty, and 6mV of timestamp uncertainty
results in 43.7cm (1.4 ft.) of range uncertainty.
 
Table 17 – Summary of Timing Interval Uniformity Results
                            Process Corner
                            TT            SS             FF
Linear Fit Slope            2.12mV/ns     2.13mV/ns      2.09mV/ns
Linear Fit Offset           347mV         347mV          348mV
Full Interval Uncertainty   3mV           7mV            6mV
2mV Uncertainty Interval    5ns - 190ns   10ns - 200ns   0ns - 180ns
    
 
Considering the results summarized in Table 17, an implementation of the
detector that has less than 2mV of timestamp uncertainty due to non-linearity in the
output timing measurements can be achieved by using only the timing interval from
10ns to 180ns, which is the range common to the 2mV intervals of all three process
corners; equivalently, the input voltage ramp can be limited to the voltages
between these times. Also noted is that the average value for the linear fit slope
leads to a conversion factor from output timestamp to input pulse peak time of 0.473
ns/mV, which is similar to the value noted in Table 13.
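The figures quoted from Table 17 can be re-derived directly. The sketch below takes the slopes, 2mV intervals, and full-interval uncertainties from Table 17 and the 7.28 cm/mV conversion from Table 13; the variable names are illustrative.

```python
# Values from Table 17 (per process corner).
slopes_mv_per_ns = {"TT": 2.12, "SS": 2.13, "FF": 2.09}
intervals_2mv_ns = {"TT": (5, 190), "SS": (10, 200), "FF": (0, 180)}
full_uncertainty_mv = {"TT": 3, "SS": 7, "FF": 6}

CM_PER_MV = 7.28  # timestamp voltage to range conversion (Table 13)

# Average slope and the resulting time-per-voltage conversion factor.
avg_slope = sum(slopes_mv_per_ns.values()) / len(slopes_mv_per_ns)
ns_per_mv = 1.0 / avg_slope

# Interval over which every corner stays within 2mV: latest start, earliest stop.
common_interval_ns = (
    max(start for start, _ in intervals_2mv_ns.values()),
    min(stop for _, stop in intervals_2mv_ns.values()),
)

# Full-interval uncertainty expressed as range.
full_uncertainty_cm = {k: v * CM_PER_MV for k, v in full_uncertainty_mv.items()}

print(f"conversion factor: {ns_per_mv:.3f} ns/mV")
print(f"common 2mV interval: {common_interval_ns[0]}ns to {common_interval_ns[1]}ns")
for corner, cm in full_uncertainty_cm.items():
    print(f"{corner}: {cm:.1f} cm")
```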
Timing interval uniformity was also considered for two-pulse separation in the TT
process corner. The two-pulse timing interval uniformity analysis is similar to the
single-pulse case. For the two-pulse simulations, one pulse was present in the center of
the interval in each simulation, and the peak time of a second pulse was varied
from simulation to simulation. The separation between pulses as well as the
separation between the two measured timestamps was recorded and plotted in
Figure 41. The gap in data at the center of the plot indicates that some simulations
resulted in the detection of only one pulse due to the merging of the two Gaussian
input pulses at close separations into a single Gaussian-esque pulse.
A new data point set was created from the timestamp point set shown in Figure 
37 and Figure 38; the set of points was shifted by -95.5ns in the input peak time 
dimension and -550mV in the measurement voltage dimension. This was done so 
that the data set from the single pulse case could be compared to the two pulse case; 
the time separation for each data point from the two pulse case matches exactly to 
one of the times of the new point set. Ideally, the timestamp voltage for the new set 
and the separation voltage for the two pulse case would be equal. This would 
indicate that there is no difference in uncertainty between the single pulse case and 
the two pulse case; any timing measurement should be the same whether it is 
relative to the center of the timing interval or relative to a pulse peak at the center of 
the timing interval. The single pulse residual plotted in Figure 41 shows the observed 
difference between the results of the two methods of measurement. It is concluded 
from Figure 41 that the difference between the two methods of measurement is 
negligible for the purposes of this thesis and, while larger than the simulation 
uncertainty, it is on a similar order of magnitude. It was assumed that the difference 
between measurement methods would be similarly negligible for the SS and FF 
process corner cases. 
 
Figure 41 – Two Pulse Separation Uniformity and Single-Pulse vs. Two-Pulse Residual 
Top: Two Pulse Peak Event Separation vs. Measurement Separation; each data point plotted 
shows the difference between the output timestamp voltages for two pulses with peaks separated 
by the amount of time indicated. Input pulse times ranged from 1ns to 198ns; points on the plot 
are spaced 1ns apart on the X-axis except for the two points to the left and right of 0ns separation; 
198 simulations were run. Bottom: Difference between points in figure top and the output 
timestamp voltages in Figure 37 where the former set of points has been shifted in time and 
voltage such that the two sets of points approximately match; note that the points match exactly 
on the time axis except for near 0ns where timing separation is undefined. 
 
For other simulations shown in Chapter 4, the chapter detailing the performance 
of the proposed detector, the output timestamps are shown in millivolts and in 
nanoseconds. To convert between the two units, the observed relationships shown in 
Figure 38, Figure 39, and Figure 40 were used for each process corner run in 
simulation. This is done to mitigate systematic uncertainty due to imperfect 
calibration. If a realized implementation has systematic uncertainty due to imperfect 
calibration of the relationship between timestamp voltage and range to a surface, 
the expected output can be determined by combining the characteristics of the 
uncertainty of the realized system with the results shown in this document. 
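A per-corner conversion of the kind described above might look like the following sketch. The slope and offset pairs are the linear fit values from Table 17; the function names are hypothetical, and the range conversion assumes that the measured time is the round-trip time of flight measured from pulse emission, which is an assumption of this example.

```python
# Linear fit (slope in mV/ns, offset in mV) per process corner, from Table 17.
FITS = {
    "TT": (2.12, 347.0),
    "SS": (2.13, 347.0),
    "FF": (2.09, 348.0),
}

SPEED_OF_LIGHT_M_PER_NS = 0.299792458


def timestamp_to_time_ns(v_mv, corner):
    """Invert the linear fit V = slope * t + offset for the given corner."""
    slope, offset = FITS[corner]
    return (v_mv - offset) / slope


def time_to_range_m(t_ns):
    """Round-trip time to one-way range (assumes t is measured from emission)."""
    return 0.5 * SPEED_OF_LIGHT_M_PER_NS * t_ns


# The same timestamp voltage maps to different times in different corners.
for corner in FITS:
    t = timestamp_to_time_ns(559.0, corner)
    print(f"{corner}: {t:.2f} ns, {time_to_range_m(t):.2f} m")
```

Applying the TT fit to a 559mV timestamp gives 100ns, while the FF fit gives roughly 101ns for the same voltage, which is the kind of systematic offset that per-corner calibration removes.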
 
4.3 Range Walk  
Theory on Range Walk is covered in Subsection 2.4.2. 
In the analysis considered in this section, the output timestamp voltage was 
observed as a function of the maximum amplitude of an input pulse at the center of 
the timing interval. Several simulations were run in which the input signal contained 
a single pulse with a known pulse peak amplitude; the output timestamp voltage was 
recorded for each simulation. 
Figure 42 and Figure 43 show the results of simulations run. Figure 42 shows the 
output timestamp voltage from the detector vs. the peak amplitude of the input 
pulse; each point plotted is the result of one simulation. Figure 43 shows the same 
data in Figure 42 after calibration using the mapping from timestamp voltage to input 
peak time observed in Section 4.2. Data from the figures is summarized in Table 18. 
Note that the data shown is for the noiseless case. Also note that a FWHM of 2.5 ns 
was used for input pulses. 
 
Figure 42 – Range Walk in Volts for Three Process Corners 
The topmost, middlemost and bottommost data are for SS, TT, and FF process corners, 
respectively. 
 
Figure 43 – Range Walk in Nanoseconds for Three Process Corners  
The topmost, middlemost and bottommost data in the plot center-left are for SS, TT, and FF 
process corners, respectively. 
 
Table 18 – Summary of Range Walk Results
                                  Process Corner
                                  SS       TT       FF       All Data
Effective Threshold Voltage
for Vth = 440mV                   439mV    445mV    458mV
Difference Between Max.
and Min. Timestamp Voltage        0.8mV    0.8mV    0.8mV    4.6mV
Difference Between Max.
and Min. Measurement Time         0.41ns   0.37ns   0.36ns   0.61ns
     
 
Several things can be noted from the summarized results. Using the conversion 
factor 0.485 ns/mV listed in Table 13, it is determined that the difference between 
the maximum and minimum timestamp voltage within the data for each process 
corner is equivalent to 0.4ns. This is comparable to the measurements after 
conversion to nanoseconds using the observations from Section 4.2. The 
observations closely match the expected response; intuitively this makes sense since 
each input pulse occurred at the same time within the timing interval; any variation 
in the measurement should primarily be caused by range walk. 
Also of note is that the difference between the maximum and minimum 
timestamp voltages among data for all process corners is equivalent to 2.23ns when 
using the conversion factor from Table 13. This difference in measurements resulting 
from calibrated and uncalibrated data is significant and is easily seen when 
comparing Figure 42 and Figure 43. This indicates, as discussed elsewhere, that 
process variation has a significant effect on the measurements and that any realized 
detector system would implement calibration of the output timestamps from 
chip-to-chip or pixel-to-pixel depending on performance requirements. A systematic 
timestamp uncertainty of 4.6mV represents a systematic range uncertainty of 33.5cm
(1.1 ft.).
After calibration of the timestamp data using observations from Section 4.2, the 
difference between the two extreme data points is 0.61ns or 9.1cm (0.3 ft.) in range. 
This is an acceptable amount of uncertainty for the considerations of this thesis. 
Note that the uncertainty due only to range walk is equal to the difference in the two 
extreme measurements within the data for each process corner; in each case noted 
in Table 18, this is roughly 0.4ns or 6cm (0.2 ft.) in range. 
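The conversions quoted in this section can be checked with a short worked calculation. The 0.485 ns/mV factor is the Table 13 value quoted above; the range conversion assumes the usual round-trip relation between time of flight and range with c ≈ 0.2998 m/ns, which is not restated in this section.

```latex
\begin{align*}
\Delta t_{\text{corner}} &= 0.485\,\tfrac{\text{ns}}{\text{mV}} \times 0.8\,\text{mV} \approx 0.39\,\text{ns} \\
\Delta t_{\text{all}} &= 0.485\,\tfrac{\text{ns}}{\text{mV}} \times 4.6\,\text{mV} \approx 2.23\,\text{ns} \\
\Delta R &= \frac{c\,\Delta t}{2} \\
\Delta R(0.4\,\text{ns}) &\approx 0.1499\,\tfrac{\text{m}}{\text{ns}} \times 0.4\,\text{ns} \approx 6.0\,\text{cm} \\
\Delta R(0.61\,\text{ns}) &\approx 0.1499\,\tfrac{\text{m}}{\text{ns}} \times 0.61\,\text{ns} \approx 9.1\,\text{cm}
\end{align*}
```

The product of the two Table 13 style factors, 0.485 ns/mV × 0.1499 m/ns ≈ 7.28 cm/mV, agrees with the voltage-to-range conversion used in Section 4.2.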
 
4.4 Multi-Pulse Timing Separation Resolution and Separation Confusion 
Theory on Timing Separation Resolution and Separation Confusion is covered in 
Subsection 2.4.3. 
In the analysis considered in this section, the separation between timestamps for 
two closely separated Gaussian-esque pulses was observed as a function of the time 
separation between the peaks of the two pulses. Several simulations were run in 
which the input signal contained two pulses, each with a known peak time; the 
separation between the two timestamp outputs was recorded for each simulation. 
Figure 44 shows the result of simulations for the TT process corner. For each 
process corner, 111 simulations were run. The separation between the two 
underlying Gaussian pulses was varied by 0.1ns from simulation to simulation; these 
two pulses were summed to create the detector input signal. One pulse was located 
near the center of the timing interval while the peak time for the second pulse was 
varied. Input peak separations ranged from positive to negative 5.5ns. Here, positive 
and negative separation refers to the fact that the order of detection of center pulse 
and the varied pulse may change. The FWHM for both pulses was 2.5ns. Note that 
the separation of the underlying Gaussian pulse peaks is not necessarily equal to the 
separation of the two Gaussian-esque pulse peaks in the signal input to the detector; 
this explains the nonlinear spacing of data points close to Input Peak Time Separation 
equal to zero in the figure; at large separations, the spacing between data points is 
approximately uniform. This also explains the multiple data points in the figure with 
Input Peak Time Separation equal to zero; if two sufficiently closely separated 
Gaussian pulses are summed together, the resulting signal will contain only one peak 
value; this is true for a range of separations. The separation voltage for non-detection 
appears as 0.2V in the figure; this is because the separation between the center of 
the voltage range and the maximum possible timestamp voltage is 0.2V. When only 
one event was detected, the first timestamp was near the center of the output 
voltage range and the second timestamp was equal to the non-detection voltage; the 
non-detection voltage was arbitrarily chosen to be the maximum possible timestamp 
voltage. 
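The merging of two closely spaced Gaussian pulses into a single peak can be demonstrated numerically. The sketch below sums two equal-amplitude Gaussians with a 2.5ns FWHM and counts the local maxima of the result. Eqn. 7 is not reproduced in this section; the example assumes the standard result that two equal Gaussians merge into one peak when their separation is at most twice the standard deviation, which gives the 2.12ns figure quoted for the theoretical minimum. The function names and grid parameters are illustrative.

```python
import math

FWHM_NS = 2.5
# Standard deviation of a Gaussian with the given full width at half maximum.
SIGMA_NS = FWHM_NS / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def summed_pulses(t_ns, separation_ns):
    """Sum of two unit-amplitude Gaussians centered at +/- separation/2."""
    a = math.exp(-0.5 * ((t_ns + separation_ns / 2.0) / SIGMA_NS) ** 2)
    b = math.exp(-0.5 * ((t_ns - separation_ns / 2.0) / SIGMA_NS) ** 2)
    return a + b


def count_peaks(separation_ns, span_ns=10.0, step_ns=0.005):
    """Count strict local maxima of the summed signal on a uniform grid."""
    n = int(2.0 * span_ns / step_ns) + 1
    ys = [summed_pulses(-span_ns + i * step_ns, separation_ns) for i in range(n)]
    return sum(1 for i in range(1, n - 1) if ys[i - 1] < ys[i] > ys[i + 1])


print(f"2 * sigma = {2.0 * SIGMA_NS:.2f} ns")
for sep in (1.0, 2.0, 2.5, 3.0, 5.5):
    print(f"separation {sep} ns -> {count_peaks(sep)} peak(s)")
```

Separations below about 2.12ns produce a single peak in the summed input, so no detector operating on that signal could report two events; the detector's own confusion separation is expected to be somewhat larger than this limit.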
 
 
Figure 44 – Two Pulse Separation Confusion for TT Process Corner 
Non-detections are highlighted. 
 
Observing Figure 44, the detector confuses the two pulses as one pulse for 
separations closer than 2.9 ns for the TT process corner case; as expected, this is 
larger than the theoretical smallest resolvable separation. A summary of results for 
all process corners is shown in Table 19. Note that the minimum theoretically 
resolvable separation is determined using Eqn. 7 and is rounded to three significant 
figures. 
Table 19 – Summary of Multi-Pulse Separation Simulation Results
Process Corner              Confusion Separation [ns]
FF                          3.1 – 3.0
TT                          2.9 – 2.7
SS                          2.7 – 2.6

Minimum Theoretically
Resolvable Separation       2.12ns
  
 
Observing Table 19, the average confusion separation for the three process 
corners is 2.8ns which is equivalent to 42cm (1.4 ft.) in range. This is roughly 0.7ns 
larger than the minimum theoretically resolvable separation, which is equivalent to 
10cm (0.3 ft.) in range larger than the theoretical minimum. So, for the timing 
interval implemented in the simulations run (see Table 13 in Section 3.4), two objects 
that generate a return within a single pixel can only be resolved if the separation 
between them is larger than 42cm in range. Note that this dead time is similar to the 
rise and fall time of the track and hold components listed in Table 14 in Section 3.4; 
thus, assumptions made regarding the uncertainty of measurements made using the 
implemented track and hold components are not violated. 
It is expected based on the minimum theoretically resolvable range separation 
equation that the dead time between timestamps can be improved by decreasing 
the FWHM time of the lidar light pulse. According to the results detailed in 
Subsection 3.2.5, the proposed design is flexible in this respect to some extent; 
however, significant reduction in lidar light pulse duration may require 
reconsideration of bandwidth capabilities for the detector. 
4.5 Uncertainty as a Function of Noise 
Timestamp timing uncertainty as a function of input noise standard deviation 
was determined. For the analysis considered in this section, a single Gaussian pulse 
was input to the detector near the center of the timing interval. Gaussian noise was 
added to the input signal. For each process corner and noise power considered, at 
least one hundred simulations were run; for each of these simulations the output 
timestamps were collected. The standard deviation of the timestamps for each 
process corner and noise power considered was calculated and plotted in Figure 46. 
Among simulations run, the added noise was not of sufficient amplitude to cause 
false alarms outside the FWHM of the input pulse; this was done to eliminate 
confusion as to the interpretation of the simulation results. In one observed 
simulation, the noise amplitude was sufficient to cause two detections close to the 
true detection time; both detections occurred no more than half the input pulse 
FWHM away from the true time. Both measurements were used in the calculation of 
the standard deviation for that process corner and noise power. 
The same input pulse was used for each simulation: a 2.5ns FWHM, 200mV
amplitude Gaussian pulse near the center of the timing interval. The threshold
voltage for the detector was arbitrarily chosen to be 10% of the full input voltage
range, i.e., 40mV above the 400mV lower limit of the input range (Vth = 440mV).
The noise was generated in simulation by filtering white Gaussian noise
using a low-pass filter with a passband of 2 GHz, a stopband of 3 GHz, passband
attenuation of less than 0.5dB, and stopband attenuation greater than 30dB; the
sample frequency of the input noise was 10 GHz. For the purposes of the analysis
considered in this section, the input noise can be approximated as Additive White
Gaussian Noise (AWGN). It is anticipated that for an implementation of the proposed
detector, noise with frequency content much greater than 1 GHz would be filtered
out by the input buffer. The noise added to the input Gaussian pulse was
independent for each simulation.
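One way to generate noise meeting the stated filter specification is sketched below. The filter type used in the simulations is not stated, so the Hamming-windowed sinc FIR design, the 61-tap length, the 2.5 GHz cutoff, and all function names here are assumptions made for illustration; only the 10 GHz sample rate and the passband and stopband requirements come from the text.

```python
import cmath
import math
import random

FS_HZ = 10e9        # sample frequency of the noise (from the text)
CUTOFF_HZ = 2.5e9   # assumed: midway between 2 GHz passband and 3 GHz stopband
NUM_TAPS = 61       # assumed filter length


def lowpass_taps(num_taps=NUM_TAPS, cutoff_hz=CUTOFF_HZ, fs_hz=FS_HZ):
    """Hamming-windowed sinc low-pass FIR, normalized to unity gain at DC."""
    fc = cutoff_hz / fs_hz
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def gain_db(taps, freq_hz, fs_hz=FS_HZ):
    """Magnitude response of the FIR filter at one frequency, in dB."""
    h = sum(t * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
            for n, t in enumerate(taps))
    return 20.0 * math.log10(abs(h))


def filtered_noise(num_samples, sigma_mv, taps, seed=0):
    """Band-limited Gaussian noise: white samples convolved with the taps."""
    rng = random.Random(seed)
    white = [rng.gauss(0.0, sigma_mv) for _ in range(num_samples + len(taps) - 1)]
    return [sum(t * white[i + j] for j, t in enumerate(taps))
            for i in range(num_samples)]


taps = lowpass_taps()
print(f"gain at 2 GHz: {gain_db(taps, 2e9):.3f} dB")
print(f"gain at 3 GHz: {gain_db(taps, 3e9):.1f} dB")
noise_mv = filtered_noise(2000, sigma_mv=20.0, taps=taps)
```

Filtering reduces the standard deviation of the noise relative to the white input, so in practice the output would be rescaled to the desired standard deviation before being added to the pulse.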
An example plot of input and outputs for the detector with AWGN on the input 
is shown in Figure 45. The AWGN shown has a standard deviation of 16.3 mV, the 
largest among noise simulations run. 
 
Figure 45 – Example transient simulation for single 2.5ns FWHM Gaussian input pulse with AWGN 
Top: Solid – Detector input; Dotted – Threshold voltage (Vth = 440mV). Bottom: Detector Outputs; 
Solid – Vt1; Dotted – Vt2 and Vt3. Note that the timing interval does not start at 0ns; rather, it 
starts at 1.5ns when the output voltages are at the minimum value and stops at 201.5ns. 
 
Figure 46 – Timestamp Uncertainty Due to Input Signal Noise 
Results from the FF, TT, and SS process corner cases are shown. At least 100 simulations were 
used to generate each data point. Note that the graph origin can be considered as a data point for 
each process corner since it is expected that a lack of input signal variation would result in a lack 
of output variation. 
 
It is observed from Figure 46 that timestamp standard deviation is approximately 
linear with respect to the standard deviation of the noise added to the input pulse. 
Another important conclusion which is drawn from this analysis is that the detector 
is functional with significant AWGN on the input signal.
5 CONCLUSION AND FUTURE WORK 
5.1 Conclusion 
A novel detection architecture for a flash lidar detector integrated in the focal 
plane was investigated. The novel architecture would enable multi-surface detection 
within each pixel for a direct-detect flash lidar system. The proposed design would 
require a footprint of at least 107.1 µm² with the largest dimension being 22 µm;
contribution to footprint by metal-layer connections was not considered in the 
estimates provided and may contribute to a small increase in the footprint of a 
realized layout. The proposed architecture is easily expandable in hardware to allow 
the detection of additional surfaces; footprint is increased by 13.1 µm² for each 
additional timestamp output. 
Previous single-surface flash lidar detectors have achieved unit cell pitches of
between 10µm and 30µm [10] [11], which is comparable to the proposed multi-surface
detecting design.
Various systematic measurement uncertainties for the detector were 
investigated and potential methods for uncertainty mitigation proposed. Additionally, 
it was verified that detector functionality was not disrupted by the introduction of 
moderate AWGN to the detector input. A summary list of performance analysis
results and conclusions from Chapter 4 follows; note that a timing interval of 200ns is
assumed as described in Table 13.
  
Performance Analysis Summary 
 
 
Section 4.2 Summary 
Timing Interval Uniformity 
 
• If the output timestamp voltages are assumed to be related to the input pulse
peak times via only the slope and other characteristics of the input timing ramp,
then the resulting systematic error between expected and actual timestamp
voltage is on the order of tens of millivolts in output voltage or on the order of
meters within a 30m range interval; this is unacceptable for the application of the
detector. The large systematic error from this assumption results primarily from
non-unity buffer amplification.
• If a linear conversion from timestamp voltages to input pulse peak times is
calibrated uniquely for each pixel, it is possible to achieve a systematic error
between expected and actual timestamp voltage of 2mV if the 200ns timing
interval assumed in this thesis is limited to the 10ns – 180ns range or, equivalently,
if the input timing ramp voltage range is limited accordingly.
• The other analysis sections assume an ideally calibrated conversion from timestamp
voltages to input pulse peak times that is unique to each pixel; this conversion was
recorded in the analysis detailed in this section. It is possible to
determine the additional uncertainty due to imperfect calibration by considering
the results of this section.
 
Section 4.3 Summary 
Range Walk 
• For input pulses ranging from maximum to minimum amplitude within the input
voltage range of 800mV – 400mV, the change in output timestamps due only to
range walk within a single pixel is no more than roughly 0.4ns in time or 6cm in
range.
• Accounting for process variation and assuming ideally calibrated unique
conversion between input peak time and timestamp time in nanoseconds for
each pixel as discussed in Section 4.2, the variation in timestamp outputs due to
variation in input pulse amplitude was 0.61ns in time or 9.1cm in range.
• Accounting for process variation and assuming the same linear conversion
between input peak time and timestamp time in nanoseconds across all pixels,
the variation in timestamp outputs due to variation in input pulse amplitude was
2.23ns in time or 33cm in range.
• The conclusion that pixel-to-pixel calibration is likely necessary for any realized
detector using the proposed design is supported by the results of this section.
 
Section 4.4 Summary 
Multi-Pulse Timing Separation Resolution and Separation Confusion 
 
• The minimum resolvable range separation between two surfaces that generate a
return within a single pixel is 42cm. It is expected that the dead time between 
timestamps can be improved by decreasing the duration of the lidar light pulse; 
such an improvement may require modification regarding the bandwidth 
capabilities of the proposed design. 
 
Section 4.5 Summary 
Uncertainty as a Function of Noise 
• Timestamp standard deviation is approximately linear with respect to AWGN
amplitude on the input and is expected to be predictable using an affine
transformation.
• The proposed design is functional with significant AWGN on the input.
 
Overall, the investigation was successful; the footprint of the proposed 
multi-surface detector is comparable to existing single-surface detectors, and the 
design functions as intended. 
5.2 Future work 
Many opportunities exist for further continuation of work described in this thesis. 
It is possible to improve the analyses by increasing the sample sizes or by conducting 
new analyses not covered in this work. In addition, the purpose of this work was to 
explore the feasibility of an architecture; as such, further optimization of the 
proposed design is possible. Perhaps the most direct way of furthering this work 
would be to generate a layout for the proposed design and realize a physical 
implementation for analysis.  
The design proposed by this work includes three analog outputs. At the end of 
each timing interval one of the three analog timing outputs will always be the voltage 
at the end of the timing ramp, as a result it will always be unused in modeling the 
target scene. The unused output cannot be determined a priori without knowledge 
of the input signal. A possible improvement for the architecture proposed in this 
work is to use a multiplexer to decrease the number of outputs from three to two 
and eliminate the unused output. Alternatively, another method of implementing the 
Track and Hold subsystem might be developed which uses more than one T/H in 
series so that a timing output can hold a voltage while a preceding stage would track
the timing ramp; this method may require modification to the timing logic. It may be
worth comparing these two methods of eliminating the superfluous output with
regard to the goal of decreasing unit cell footprint.
Investigation into a more efficient reset mechanism may be performed. Rather 
than the present mechanism, a reset could involve loading a pattern of all zeros and 
a single one into the circular shift register while simultaneously clearing the output 
buffers by momentarily connecting the holding capacitances to ground. Using this 
mechanism, a non-detection output would be equal to the ground voltage. It has also 
been noted that the NOR gate in the present reset mechanism can be eliminated. 
In the proposed design, the output timestamps are analog and require an 
external ADC to convert the values such that they can be used in a digital computer. 
A possible modification to this design is the replacement of the analog T/H system 
with digital timing. As noted elsewhere, this may result in interference with the input 
analog signal for the detector. A study involving simulation of layouts for both 
methods may be necessary to characterize the interference and determine its 
significance. Such an investigation could also consider the effects of one or several 
on-chip ADCs which would be used in combination with the analog method. 
As noted elsewhere, the proposed design is easily expandable in hardware
because it is simple to add additional analog timestamp output branches as well as 
memory elements to the circular shift register for control of said branches. It can 
certainly be imagined that a dynamically expandable version of the proposed design 
could be developed if interest in such a project arose. 
Future work may include layout, fabrication, and analysis of a manufactured
chip based on the proposed architecture with possible modification.
  
Appendix A 
SCHEMATICS AND SYMBOLS NOT SHOWN ELSEWHERE 
 
Contained in this appendix are schematics and symbols for components used in 
the detector design, many of which are not contained within the main body of this 
thesis. Table 20 shows a list of schematics included in this appendix. Also note that 
many of the symbols in this appendix are scaled up in size as compared to elsewhere; 
this is for viewing convenience of the reader. 
 
Table 20 – List of Schematics in Appendix A
Figure Number & Description
Figure 47 – Single-Ended Output, Differential Input CMOS Amplifier
Figure 48 – DFFwRST0: D-Type Flip Flop with Reset-to-Zero Functionality
Figure 49 – RS Latch
Figure 50 – NAND Gate
Figure 51 – NOR Gate
Figure 52 – NOT Gate
Figure 53 – Transmission Gate

Figure 47 – Single-Ended Output, Differential Input CMOS Amplifier  
Left: transistor-level schematic. Right: symbol; Vinp and Vinn are marked with a ‘+’ and a ‘−’ sign, 
respectively. 
 
 
 
Figure 48 – DFFwRST0: D-Type Flip Flop with Reset-to-Zero Functionality. 
Total Transistor Area: 0.672µm² 
 
 
 
 
Figure 49 – RS Latch. Total Transistor Area: 0.288µm² 
 
 
 
 
 
 
 
 
 
 
  
Figure 50 – NAND Gate 
Left: transistor-level schematic. Right: symbol. Total Transistor Area: 0.096µm²  
 
 
 
 
 
 
 
 
 
 
 
 
Figure 51 – NOR Gate  
Left: transistor-level schematic. Right: symbol. Total Transistor Area: 0.144µm² 
 
 
 
 
 
 
 
 
Figure 52 – NOT Gate  
Left: transistor-level schematic. Right: symbol. Total Transistor Area: 0.048µm² 
 
 
 
 
 
 
 
Figure 53 – Transmission Gate  
Left: transistor-level schematic. Right: symbol; C and C’ are the top inputs, where C’ is 
the input marked with a bubble. Total Transistor Area: 0.048µm² 
 
96 
 
References 
 
[1]  M. Hansard, S. Lee, O. Choi and R. Horaud, Time of Flight Cameras: 
Principles, Methods, and Applications, Springer, 2012.  
[2]  G. Zhou, X. Zhou, J. Yang, Y. Tao, X. Nong and O. Baysal, "Flash Lidar Sensor 
Using Fiber-Coupled APDs," IEEE SESNSORS JOURNAL, vol. 15, no. 9, 
2015.  
[3]  V. Roback, R. Reisse, A. Bulyshev and F. Amzajerdian, "Helicopter Flight 
Test of 3-D Imaging Flash LIDAR Technology for Safe, Autonomous, 
and Precise Planetary Landing," United States: NASA Center for 
Aerospace Information, 2013. 
[4]  M. Monticello, "THE STATE OF THE SELF-DRIVING CAR," Consumer 
Reports, vol. 81, no. 5, pp. 44-49, May 2016.  
[5]  W. Zhang, Y. Chen, H. Wang, M. Chen, X. Wang and G. Yan, "Efficient 
registration of terrestrial LiDAR scans using a coarse-to-fine strategy 
for forestry applications," Agricultural & Forest Meteorology, vol. 225, 
pp. 8-23, 2016.  
[6]  M. Hollaus, C. Aubrecht, B. Hofle, K. Steinnocher and W. Wagner, 
"Roughness Mapping on Various Vertical Scales Based on 
Full-Waveform Airborne Laser Scanning Data," REMOTE SENSING, vol. 
97 
 
3, pp. 503-p523, 2011.  
[7]  F. Tsai, J.-S. Lai and Y.-H. Lu, "Full-Waveform LiDAR Point Cloud Land Cover 
Classification with Volumetric Texture Measures," Terrestrial, 
Atmospheric & Oceanic Sciences, vol. 27, no. 4, pp. 549-563, 2016.  
[8]  Z. Pan, C. Glennie, P. Hartzell, J. C. Fernandez-Diaz, C. Legleiter and B. 
Overstreet, "Performance Assessment of High Resolution Airborne 
Full Waveform LiDAR for Shallow River Bathymetry," Remote Sensing, 
vol. 7, no. 5, pp. 5133-5159, 2015.  
[9]  H. V. Duong, M. A. Lefsky, T. Ramond and C. Weimer, "The Electronically 
Steerable Flash Lidar: A Full Waveform Scanning System for 
Topographic and Ecosystem Structure Applications," IEEE 
TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, vol. 50, no. 
11, pp. 4809-4820, 2012.  
[10]  D. Stoppa, N. Massari, L. Pancheri, M. Malfatti, M. Perenzoni and L. Gonzo, 
"A Range Image Sensor Based on 10-�m Lock-In," IEEE JOURNAL OF 
SOLID-STATE CIRCUITS, vol. 46, no. 1, pp. 248-258, 2011.  
[11]  M. Perenzoni, N. Massari, D. Stoppa, L. Pancheri, M. Malfatti and L. Gonzo, 
"A 160 120-Pixels Range Camera With In-Pixel," IEEE JOURNAL OF 
SOLID-STATE CIRCUITS, vol. 46, no. 7, pp. 1672-1681, 2011.  
98 
 
[12]  S. Henzler, Time-to-Digital Converters, Dordrecht: Springer, 2010.  
[13]  D. A. Gedcke and W. J. McDonald, "A CONSTANT FRACTION OF PULSE 
HEIGHT TRIGGER FOR OPTIMUM TIME RESOLUTION," NUCLEAR 
INSTRUMENTS AND METHODS, vol. 55, pp. 377-380, 1967.  
[14]  W. R. Leo, "Constant Fraction Triggering (CFT)," in Techniques for Nuclear 
and Particle Physics Experiments: A How-to Approach, Berlin, 
Springer-Verlag, 1987, p. 319. 
[15]  B. Turko and R. Smith, "A PRECISION TIMING DISCRIMINATOR FOR HIGH 
DENSITY DETECTOR SYSTEMS," IEEE TRANSACTIONS ON NUCLEAR 
SCIENCE, vol. 39, no. 5, pp. 1311 - 1315, 1992.  
[16]  D. M. Binkley, "Performance of Non-Delay-Line Constant-Fraction 
Discriminator Timing Circuits," IEEE TRANSACTIONS ON NUCLEAR 
SCIENCE, vol. 41, no. 4, pp. 1169 - 1175, 1994.  
[17]  M. L. Simpson, C. L. Britton, A. L. Wintenberg and G. R. Young, "An 
Integrated, CMOS, Constant-Fraction Timing Discriminator for 
Multichannel Detector Systems," IEEE TRANSACTIONS ON NUCLEAR 
SCIENCE, vol. 42, no. 4, pp. 762-766, 1995.  
[18]  D. M. Binkley, B. S. Puckett, B. K. Swann, J. M. Rochelle, M. S. Musrock and 
M. E. Casey, "A 10-Mc/s, 0.5-µm CMOS Constant-Fraction 
99 
 
Discriminator Having Built-In Pulse Tail Cancellation," IEEE 
TRANSACTIONS ON NUCLEAR SCIENCE, vol. 49, no. 3, pp. 1130-1140, 
2002.  
[19]  S. Kim, G. Ko, S. Kwon and J. Lee, "Development of a non-delay line 
constant fraction discriminator based on the Padé approximant for 
time-of-flight positron emission tomography scanners," Journal of 
Instrumentation, vol. 10, 2015.  
[20]  White, Peter; Applied Radio Labs, "Group Delay - Explanations and 
Applications - DN004," 15 November 1999. [Online]. Available: 
http://www.radio-labs.com/DesignFile/DN004.pdf. [Accessed 2016]. 
[21]  T. R. Kuphaldt, "Chapter 12 SHIFT REGISTERS," in Lessons In Electric 
Circuits, Volume IV – Digital, 2007.  
[22]  M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions: 
With Formulas, Graphs, and Mathematical Tables, 10 ed., 1972.  
[23]  E. W. Weisstein, "Full Width at Half Maximum," From MathWorld--A 
Wolfram Web Resource, [Online]. Available: 
http://mathworld.wolfram.com/FullWidthatHalfMaximum.html. 
[24]  K. K. Sharma, Optics: principles and applications, Amsterdam: Academic 
Press, 2006.  
100 
 
[25]  T. C. Carusone, D. A. Johns and K. W. Martin, "3.8 MOS DIFFERENTIAL PAIR 
AND GAIN STAGE," in ANALOG INTEGRATED CIRCUIT DESIGN, 
Hoboken, NJ, John Wiley & Sons, 2011, pp. 135-137. 
[26]  B. P. Lathi, Linear Systems and Signals, New York: Oxford University Press, 
2005.  
[27]  N. H. E. Weste and D. M. Harris, CMOS VLSI Design: A Circuits and Systems 
Perspective, Addison-Wesley, 2011.  
 
 
