A mite based translinear fpaa and its practical implementation by Abramson, David
A MITE BASED TRANSLINEAR FPAA AND ITS
PRACTICAL IMPLEMENTATION
A Dissertation
Presented to
The Academic Faculty
By
David Abramson
In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy
in
Electrical and Computer Engineering
School of Electrical and Computer Engineering
Georgia Institute of Technology
December 2008
Copyright © 2008 by David Abramson
A MITE BASED TRANSLINEAR FPAA AND ITS
PRACTICAL IMPLEMENTATION
Approved by:
Dr. Paul E. Hasler, Advisor
Professor, School of ECE
Georgia Institute of Technology
Atlanta, GA
Dr. David V. Anderson
Professor, School of ECE
Georgia Institute of Technology
Atlanta, GA
Dr. Maysam Ghovanloo
Professor, School of ECE
Georgia Institute of Technology
Atlanta, GA
Dr. James Hamblen
Professor, School of ECE
Georgia Institute of Technology
Atlanta, GA
Dr. Bradley Minch
Professor, School of ICT
Olin College of Engineering
Needham, MA
Date Approved: September 2008
ACKNOWLEDGMENTS
I would like to thank my advisor, Paul Hasler, and all of my colleagues within the
CADSP research group for all of their help. I would not have been able to complete this
dissertation without them. In addition, I would like to thank my mother, Judy Abramson, for
helping support me during my stay in graduate school. Finally, I would like to thank my wife,
Masha, who provided me with all the support, love, and encouragement one could ever need.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
SUMMARY
CHAPTER 1 ANALOG RECONFIGURABILITY AND DESIGN ABSTRACTION
1.1 General Reconfigurable Analog Architectures
1.2 Questions of Analog Reconfigurability
1.3 The Translinear FPAA Design Flow and Supporting Circuitry
CHAPTER 2 FLOATING-GATE TRANSISTORS
2.1 Modifying the Floating-gate Charge
2.2 Programming Arrays of Floating Gates
2.3 Floating Gates in Reconfigurable Systems
2.3.1 Floating-gate Switches
2.3.2 Programmable Elements
2.3.3 Offset Removal
2.3.4 Programming Considerations
CHAPTER 3 MULTIPLE INPUT TRANSLINEAR ELEMENTS
3.1 Implementation of a Multiple Input Translinear Element
3.2 Synthesis Procedures
3.3 Programmable MITEs
3.4 Building Blocks of MITE systems
3.4.1 Translinear Loops
3.4.2 Filters
CHAPTER 4 RECONFIGURABLE MITE ARCHITECTURES
4.1 The RAAM
4.1.1 Original System Architecture
4.1.2 Original MITE CAB
4.1.3 Examples and Results
4.2 An Improved Translinear FPAA
4.2.1 Improved System Architecture
4.2.2 Improved MITE CAB
4.2.3 Examples and Results
CHAPTER 5 FROM EQUATION TO HARDWARE: THE SOFTWARE IN BETWEEN
5.1 Network Synthesis
5.2 Place-and-route
5.3 Hardware Programming
CHAPTER 6 VOLTAGE-CURRENT AND CURRENT-VOLTAGE CONVERTERS FOR SYSTEM INTERFACING
6.1 A VI converter
6.2 A logarithmic bidirectional IV converter
CHAPTER 7 A FLOATING-GATE PIPELINED ADC
CHAPTER 8 THE PRESENT AND FUTURE OF TRANSLINEAR FPAAS
8.1 Integration of Fixed Blocks
8.2 Optimization of the Software Chain
8.3 Mixed-signal Reconfigurable Platforms
REFERENCES
LIST OF TABLES
Table 5.1 Exponent Patterns Generated with Different Gate Connections
Table 6.1 Harmonic Distortional Analysis of Improved VI Converter (dB)
LIST OF FIGURES
Figure 1.1 Power Consumption trend of DSPs compared to analog signal processing
Figure 1.2 FPAA enabled design flow
Figure 1.3 Example of programmability in a general FPAA architecture
Figure 1.4 Parasitics introduced by switching elements
Figure 1.5 Design flow using a translinear FPAA
Figure 2.1 Schematic of a floating-gate pFET
Figure 2.2 I-V traces for a programmed floating gate
Figure 2.3 Band diagrams illustrating electron tunneling
Figure 2.4 Diagram of hot-electron injection
Figure 2.5 Array isolation of a floating gate for programming
Figure 2.6 Resistance curves of a floating-gate pFET
Figure 2.7 Resistance comparison of switches
Figure 2.8 A simple floating-gate current source
Figure 2.9 A simple floating-gate voltage source
Figure 2.10 Example of offset removal using floating gates
Figure 2.11 Problem with programming multiple switches that share a drain line
Figure 2.12 Floating-gate switch programming improvements
Figure 3.1 Possible MITE realizations
Figure 3.2 Subthreshold MOSFET realization of a MITE
Figure 3.3 Methods used to program the charge on a MITE
Figure 3.4 Schematic of a 2nd-order translinear loop
Figure 3.5 Simulation results of the translinear loop
Figure 3.6 Schematic of a 1st-order low-pass log-domain filter
Figure 3.7 Simulation results of 1st-order low-pass filter
Figure 4.1 System architecture of the RAAM
Figure 4.2 Schematic of a MITE CAB
Figure 4.3 Layout of the RAAM
Figure 4.4 Schematic of squaring circuit implemented on the RAAM
Figure 4.5 Compilation of a squaring circuit onto the RAAM
Figure 4.6 Results of the squaring circuit
Figure 4.7 Compilation of a translinear loop circuit onto the RAAM
Figure 4.8 Results of the translinear loop
Figure 4.9 Results of the square root circuit
Figure 4.10 Results of the vector magnitude circuit
Figure 4.11 Frequency response of the log-domain filter
Figure 4.12 System architecture of the improved MITE FPAA
Figure 4.13 Basic MITE computation element of the improved MITE FPAA
Figure 4.14 Log-domain filter of the improved MITE FPAA
Figure 4.15 Layout of the MFPAA
Figure 4.16 Results of a coefficient multiplication
Figure 4.17 Kappa variation over the standard operation range of a MITE
Figure 4.18 Theoretical response of a 2nd-order translinear loop including κ variation
Figure 4.19 Results of the improved coefficient multiplication
Figure 4.20 Results of a squaring circuit on the MFPAA
Figure 4.21 Circuit to implement a cube root on the MFPAA
Figure 4.22 Results of a cube root circuit on the MFPAA
Figure 4.23 Results of an Iout = Iin^(2/3) circuit on the MFPAA
Figure 4.24 Results from a first-order low-pass filter
Figure 4.25 Results from a first-order high-pass filter
Figure 4.26 Results from a RMS-to-DC converter
Figure 4.27 Topology of the geometric mean current splitter
Figure 4.28 Results from a Geometric Current Splitter
Figure 4.29 Compilation of a sinusoidal oscillator on the MFPAA
Figure 5.1 Depiction of Equation Parsing
Figure 5.2 Sample Output of Network Synthesis Function
Figure 5.3 Sample Output of Place-and-Route Function
Figure 5.4 MFPAA CAB routing diagram
Figure 5.5 MFPAA CAB offsets
Figure 5.6 MFPAA CAB programming information
Figure 5.7 MFPAA CAB details
Figure 5.8 MFPAA I/O CAB details
Figure 5.9 Sample output of the GUI for interfacing with the FPAA
Figure 5.10 Laboratory setup of the full system
Figure 6.1 Improved VI converter of the improved MITE FPAA
Figure 6.2 Layout of the improved VI converter
Figure 6.3 Simulated Transfer Characteristic of the Improved VI Converter
Figure 6.4 Simulated Frequency Response of the Improved VI Converter
Figure 6.5 Schematic of the bidirectional logarithmic IV converter
Figure 6.6 DC Results of the IV converter
Figure 6.7 Schematic of the improved bidirectional logarithmic IV converter
Figure 6.8 Output Characteristic of Improved IV Converter
Figure 6.9 Simulated Transient Response of Improved IV Converter
Figure 7.1 Architecture of the pipelined ADC
Figure 7.2 Architecture of the 1-bit stage
Figure 7.3 Floating-gate voltage reference
Figure 7.4 Layout of the ADC converter
Figure 7.5 Simulation results of the 10-bit ADC
Figure 8.1 Complete system implementing an equation in hardware
SUMMARY
While the development of reconfigurable analog platforms is a blossoming field, the
tradeoff between usability and flexibility continues to be a major barrier. Field Programmable
Analog Arrays (FPAAs) built with translinear elements offer a promising solution to this
problem. These FPAAs can be built to use previously developed synthesis procedures for
translinear circuits. Furthermore, large-scale translinear FPAAs can be built using
floating-gate transistors as both the computational elements and the reconfigurable
interconnect network. Two FPAAs, built using Multiple Input Translinear Elements (MITEs),
have been designed, fabricated, and tested. These devices have been programmed to implement
various circuits including multipliers, squaring circuits, current splitters, and filters. In
addition, synthesis, place-and-route, and programming tools have been created in order to
implement a reconfigurable system where the circuits implemented are described only by
equations. Supporting circuitry for interfacing with current-mode, translinear FPAAs has
also been developed. This circuitry includes a voltage-to-current converter, a
current-to-voltage converter, and a pipelined analog-to-digital converter. The continued
development of translinear FPAAs will lead to a reconfigurable analog system that allows for
a large portion of the design to be abstracted away from the user.
CHAPTER 1
ANALOG RECONFIGURABILITY AND DESIGN ABSTRACTION
One of the biggest breakthroughs in the field of digital integrated circuits has been the
field-programmable gate array (FPGA). This is not only because of the rapid prototyping
that FPGAs enable, but also because they opened up the use of digital circuits to those
without expertise in the field. While field-programmable analog arrays (FPAAs) are attempting
to fill a similar void in the analog field, they have not yet been developed to the point
where designers are adopting them. FPAAs are being developed at a time when analog signal
processing is on the rise, due to the power savings it offers over traditional digital
solutions. As shown in Figure 1.1 [1], analog signal processing would offer approximately
a 20-year leap in power savings compared to digital signal processors (DSPs). In addition
to offering significant power savings, a reconfigurable analog platform would allow the
user to prototype designs, cutting down on the fabrication cycle and facilitating a faster
time to market. This idea is illustrated in Figure 1.2 [1]. Note that an FPAA eliminates
the need for multiple fabrication cycles by enabling rapid hardware prototyping. It is also
important to note that the synthesis step shown in Figure 1.2b is often only a compilation
step consisting of place-and-route and hardware programming. Actual network synthesis,
the automated creation of a circuit from a behavioral description, would eliminate the need
for the design stage.
1.1 General Reconfigurable Analog Architectures
The basic reconfigurable architecture for implementing analog circuits consists of a bank of
analog components and an interconnect network. The bank of analog components, known
as a Configurable Analog Block (CAB), contains fixed analog circuits and programmable
elements. The programmable elements, normally implemented as voltage or current DACs,
are used to vary the exact computation being performed without changing the overall
behavior of the circuit. For example, a binary weighted programmable current mirror might
be used to vary the coefficient of a current-mode multiplication. This is illustrated in
Figure 1.3. Switches S1 through S3 are controlled digitally and their settings are stored in
memory cells.
Figure 1.1. The power consumption trend is shown for DSPs. Moving from digital to analog signal
processing can make an improvement in power efficiency that is equivalent to a 20-year
leap.
Figure 1.2. A comparison of the traditional analog design flow, shown in (a), and the FPAA enabled
design flow, shown in (b). Note that the synthesis stage in the FPAA enabled design flow is
actually a compilation stage consisting of place-and-route and hardware programming.
Figure 1.3. Example of programmability in a general FPAA architecture. A binary weighted current
mirror is used to create a variable gain for current scaling. Switches S1 through S3 are
controlled digitally.
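The variable gain of the binary weighted current mirror in Figure 1.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the branch weights 1, 2, and 4 (relative to the input transistor), the ideal mirroring, and the function names are hypothetical and are not taken from the dissertation.

```python
def mirror_gain(switches, unit_weights=(1, 2, 4)):
    """Return I_out / I_in for an ideal binary weighted current mirror.

    switches     -- sequence of 0/1 settings for S1..S3 (1 means closed)
    unit_weights -- ASSUMED relative sizes of the mirror branches
    """
    if len(switches) != len(unit_weights):
        raise ValueError("one switch setting is needed per mirror branch")
    # Each closed switch adds its branch's weight to the total gain.
    return sum(w for s, w in zip(switches, unit_weights) if s)


def mirror_output(i_in, switches):
    """Scale the input current by the digitally selected gain."""
    return i_in * mirror_gain(switches)


if __name__ == "__main__":
    # S1 and S3 closed, S2 open: gain of 1 + 4 under the assumed weights.
    print(mirror_gain((1, 0, 1)))
```

With three switches such a mirror offers only eight discrete gain settings, which is the kind of coarse programmability that a continuously programmable element would improve on.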
The interconnect networks in most reconfigurable analog platforms are made up of
either transmission gates or pass gates. This network allows the elements in the CAB to be
connected in various topologies, giving the user the ability to change the overall behavior of
the system. These switch matrices may be full crossbar networks or reduced tree networks
depending on the level of reconfigurability needed.
1.2 Questions of Analog Reconfigurability
While FPGAs have been developed for commercial use, FPAAs have not had the same
success. The chief reason for this is the “non-existence of an abstracted universal block
which could be used to systematically realize analog circuits (such as the concept of gates
in digital design)” [2]. This can be seen by comparing FPAAs currently on the market or
under development. Anadigm’s FPAA and its software package, Anadigm Designer, use
switched-capacitor circuits to realize the user’s desired circuit. On the other end of the
granularity spectrum are Field Programmable Transistor Arrays (FPTAs), which use transistors
that must be connected together with switches to realize the user’s circuit [3]. In addition,
there are FPAAs that are built using only gm-C filters [4, 5], opamps and passive
components [6], and transconductors [7]. There are also FPAAs that try to solve this problem
by using a mixture of analog blocks to realize circuits. The Reconfigurable Analog Signal
Processor (RASP) includes a mixture of operational transconductance amplifiers (OTAs),
transistors, passive components, and some higher level blocks in order to maintain
flexibility while being able to compile a wide spectrum of analog circuits [1].
Another problem limiting the development of FPAAs is the inherent tradeoff between
flexibility and the appropriate level of abstraction. Most current FPAAs tend toward one
of the extremes of this tradeoff. For example, FPTAs are highly flexible but offer almost
no real level of abstraction. At the other extreme are some of the gm-C based FPAAs,
which offer a high level of abstraction (filter design) but no true flexibility. This
tradeoff is also seen in the tools used to interface with the reconfigurable platform. For
example, platforms without an appropriate level of abstraction struggle to incorporate any
type of synthesis into their tools, while platforms with high levels of abstraction and limited
flexibility can include synthesis in their tool packages, but only for very narrow scopes.
In addition to this tradeoff, achieving high flexibility often comes at the cost of
introducing an unreasonable amount of parasitics into the realized circuits. This is primarily
because higher flexibility often means larger interconnect networks. The interconnect
networks consist of switching elements that reconfigure the analog hardware
being used. These switching elements, often made of transmission gates, introduce both
parasitic resistances and parasitic capacitances into the signal path. This idea is illustrated
in Figure 1.4. While the drawbacks of the parasitic resistances can be reduced by routing
signals that do not require any current draw, i.e. signals that go only to gates of MOSFETs,
the parasitic capacitances will affect both the speed and stability of the implemented
circuits.
Figure 1.4. A transmission gate, shown in (a), is the most common switching element found in
reconfigurable analog platforms. It consists of nFET and pFET pass transistors in parallel in
order to achieve relatively low resistance over all voltages. However, the transmission gate
introduces both a parasitic resistance and two parasitic capacitances along the signal path,
shown in (b).
1.3 The Translinear FPAA Design Flow and Supporting Circuitry
The use of translinear circuits as the universal analog block can help reduce the tradeoff
between flexibility and abstraction level. Using translinear circuits, for which known network
synthesis procedures exist, it is possible to build a system in which the only input necessary
is the set of equations that describe the system to be implemented. The translinear FPAA
will be able to implement a wide range of circuits, including all linear, polynomial, and
rational static equations and most differential equations, while requiring the user to
perform no actual analog design. This idea is illustrated by the translinear FPAA design flow,
shown in Figure 1.5. Unlike the traditional FPAA design flow, there are no design or
simulation steps required to create the working system. This will allow those with a background
in math, controls, physics, or many other fields to easily interact with the FPAA. This is
similar to how those with a background in computer programming can easily interact with
FPGAs.
[Figure 1.5 diagram labels. Stages: System Equations, Circuit Netlist, FPAA Mapping, Hardware
Implementation, Working System. Steps: Synthesis, Place-and-route, Programming, Testing.]
Figure 1.5. Design flow using a translinear FPAA. Using translinear circuits allows the user to enter a
set of equations which is then netlisted using existing synthesis procedures. The circuit is
then placed and routed and the system is programmed onto the FPAA.
In order to make the use of a translinear FPAA practical, supporting circuitry must
also be created. Most importantly, translinear circuits are current-mode and thus require
both voltage-to-current (VI) and current-to-voltage (IV) converters in order to
easily interface them with other circuits or systems. These converters must be accurate,
fast, and cover a very large dynamic range. The dynamic range of the systems being built
can be quite large, as the translinear elements themselves have dynamic ranges on the order
of 80 dB to 100 dB. In addition, in order to make interfacing even easier, DACs and ADCs
should be included. This dissertation will explore the idea of a translinear FPAA, including
this supporting circuitry, and present a system which implements the improved design flow.
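To give a feel for the dynamic range quoted for the translinear elements, the decibel figures can be converted to current ratios. The conversion below assumes the usual amplitude convention of 20 log10 for currents; the text does not state which convention is intended, so this is an illustration rather than the author's definition.

```latex
\mathrm{DR}_{\mathrm{dB}} = 20\log_{10}\!\left(\frac{I_{\max}}{I_{\min}}\right)
\quad\Longrightarrow\quad
\frac{I_{\max}}{I_{\min}} = 10^{\mathrm{DR}_{\mathrm{dB}}/20}

80\ \mathrm{dB} \;\rightarrow\; \frac{I_{\max}}{I_{\min}} = 10^{4},
\qquad
100\ \mathrm{dB} \;\rightarrow\; \frac{I_{\max}}{I_{\min}} = 10^{5}
```

Under that assumption, the VI and IV converters would need to handle currents spanning roughly four to five decades.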
CHAPTER 2
FLOATING-GATE TRANSISTORS
At the core of our reconfigurable systems are floating-gate transistors. They make it
possible to effectively shift the threshold of the transistor, allowing us to turn switches on
and off, precisely set bias values, and cancel offsets due to threshold mismatch.
Furthermore, the computational elements of the system will be floating-gate devices as well.
Floating-gate transistors have a gate that is completely surrounded by silicon dioxide,
allowing for the storage of charge on the gate of the transistor. In subthreshold, a
floating-gate pFET, shown in Figure 2.1, has a current-voltage relationship
    I = I_0 \, e^{(V_s - \kappa V_{fg}) / U_T}    (2.1)
where κ is the capacitive division between the oxide capacitance and the depletion
capacitance, U_T is kT/q, and V_fg, the voltage on the floating gate, is
    V_{fg} = V_g \frac{C_1}{C_T} + V_{offset}    (2.2)
where C_T is the total capacitance at the floating gate and V_offset is determined by the charge
on the floating gate. Note that equations 2.1 and 2.2 are for a device in saturation and that
V_s and V_fg are measured with respect to the bulk.
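As a rough numerical illustration of equations 2.1 and 2.2, the sketch below evaluates the subthreshold current of a floating-gate pFET for two different stored offsets. The parameter values (I_0, κ, the ratio C_1/C_T, and the terminal voltages) and the function names are assumptions chosen only to keep the result in a plausible subthreshold range; they are not taken from the dissertation.

```python
import math

U_T = 0.02585  # thermal voltage kT/q near 300 K, in volts


def floating_gate_voltage(v_g, c1_over_ct, v_offset):
    """Equation 2.2: control-gate coupling plus the stored-charge offset."""
    return v_g * c1_over_ct + v_offset


def channel_current(v_s, v_g, v_offset, i_0=1e-15, kappa=0.7, c1_over_ct=0.8):
    """Equation 2.1 for a saturated subthreshold pFET.

    All voltages are referred to the bulk. i_0, kappa, and c1_over_ct are
    ASSUMED illustrative values, not measured device parameters.
    """
    v_fg = floating_gate_voltage(v_g, c1_over_ct, v_offset)
    return i_0 * math.exp((v_s - kappa * v_fg) / U_T)


if __name__ == "__main__":
    # Source tied to the bulk (v_s = 0), control gate 1 V below the bulk.
    for v_offset in (0.3, 0.4):
        current = channel_current(v_s=0.0, v_g=-1.0, v_offset=v_offset)
        print(f"V_offset = {v_offset:.1f} V -> I = {current:.3e} A")
```

In this model, raising V_offset by ΔV divides the current by exp(κΔV/U_T) at every gate voltage, which is why a change in stored charge appears as a lateral shift of the I-V curve, in other words an effective change in threshold.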
2.1 Modifying the Floating-gate Charge
Two methods are used to alter the charge on the floating gate. Fowler-Nordheim tunneling
is used to remove electrons from the floating gate and hot-electron injection is used to add
electrons to the floating gate [8]. From equations 2.1 and 2.2 it is seen that the offset term
introduced by the charge on the floating gate essentially changes the threshold voltage of
the transistor. This can be seen in Figure 2.2 in which I-V sweeps are performed on the
same transistor with different amounts of charge on the floating gate.
[Figure 2.1 schematic labels: Vg, C1, Vfg, Vd, Vs, I, Vtun]
Figure 2.1. Schematic of a floating-gate pFET. The gate of the transistor is completely surrounded by
silicon dioxide, allowing charge to be stored there.
[Plot for Figure 2.2: channel current (A), logarithmic axis from 10^-12 to 10^-4, versus gate voltage (V), 0 to 3 V]
Figure 2.2. Shifted I-V traces for a programmed floating gate. Each curve was taken on the same device
with identical terminal voltages. The charge on the floating gate was modified for each trace
in order to shift the threshold of the device.
[Band diagrams for Figure 2.3, panels (a) and (b); labels: SiO2, Ec, Floating Gate, Vtun]
Figure 2.3. Band diagrams illustrating electron tunneling. (a) Band diagram when tunneling is not
occurring; Vtun is set to Vdd. (b) Band diagram during tunneling; Vtun is set to a large voltage.
Fowler-Nordheim electron tunneling works by applying a large voltage across the
tunneling capacitor. The large field across the capacitor thins the energy barrier, allowing
electrons to tunnel through it and off of the floating gate, effectively increasing the charge
on the floating gate. This transition is shown as the change from Figure 2.3a to Figure 2.3b.
Note that electrons leave the floating gate when the tunneling voltage is high, removing
negative charge from the node.
Hot-electron injection works by applying a large source-drain voltage to the pFET while
current is flowing through the channel. The process is outlined in Figure 2.4. Holes
are accelerated toward the drain until they collide with atoms at the drain edge of
the drain-channel depletion region, creating an electron-hole pair, shown as step 1 in the
figure. The electron is then accelerated by the large field back toward the source. While
most of these electrons end up back in the well in which the transistor is fabricated (step 2),
some of these “hot” electrons gain enough energy to escape through the oxide of the
transistor and add negative charge to the floating gate, shown as step 3 in the figure. This
effectively reduces the charge on the node.
2.2 Programming Arrays of Floating Gates
In order for floating-gate transistors to be used in large-scale reconfigurable systems, an
array programming scheme must exist to selectively alter the charge on a single floating
gate. This process has been previously developed [8]. For practical selectivity
in a two-dimensional array, programming must require two conditions at once, so that one
condition can be applied along the chosen element’s row and the other along its
column. Electron tunneling, which only requires a large voltage across the tunneling
capacitor, can thus only be used as a global erase. Hot-electron injection, however, can
be used to selectively program an array of floating gates. This is possible because injection
requires two conditions in order to occur: a high source-drain voltage and current in the
channel of the device. The selection process is outlined in Figure 2.5 [9]. The undesired
columns are turned off by setting their gate voltage to Vdd, making sure no current flows
through the channel of the device. The undesired rows are turned off by setting their drains
to Vdd, making sure they do not have a large enough source-drain voltage for injection.
Figure 2.4. Diagram of hot-electron injection. Holes are accelerated, by a large source-drain voltage,
towards the drain, where they collide with the ions at the edge of the drain-channel depletion
region, creating an electron-hole pair (1). Most of the “hot” electrons created from the
impact ionization return to the substrate (2), but some gain enough energy, as they are
accelerated by the large source-drain electric field, to overcome the barrier of the silicon
dioxide and become trapped on the floating gate (3).
[Figure 2.5 schematic labels: rows R0 to R2 with drain control voltage, columns C0 to C3 with gate control voltage]
Figure 2.5. Isolation of a floating gate for programming. Injection is prevented in the undesired
columns by setting the gate of the pFET to Vdd, making sure there is no channel current,
and in the undesired rows by setting the drain of the pFET to Vdd, making sure there is not
a large source-drain voltage. Figure reprinted from [9].
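The row and column selection described for Figure 2.5 can be sketched as a small Python model. This is a deliberately simplified, boolean picture: a cell is treated as injecting only when its column gate line is pulled below Vdd (channel current flows) and its row drain line is pulled below Vdd (large source-drain voltage). The supply value, the programming voltages, and the function names are all hypothetical and serve only to illustrate the selection logic.

```python
VDD = 3.3  # ASSUMED supply voltage in volts


def selection_voltages(n_rows, n_cols, row, col, v_gate_prog=1.0, v_drain_prog=0.0):
    """Return (gate line voltages per column, drain line voltages per row).

    Unselected columns get their gate at VDD (no channel current) and
    unselected rows get their drain at VDD (no large source-drain voltage).
    v_gate_prog and v_drain_prog are illustrative programming levels.
    """
    gates = [v_gate_prog if c == col else VDD for c in range(n_cols)]
    drains = [v_drain_prog if r == row else VDD for r in range(n_rows)]
    return gates, drains


def injecting_cells(gates, drains, vdd=VDD):
    """List the (row, col) cells where both injection conditions are met."""
    return [
        (r, c)
        for r, v_d in enumerate(drains)
        for c, v_g in enumerate(gates)
        if v_g < vdd and v_d < vdd
    ]


if __name__ == "__main__":
    # A 3 x 4 array like the one labeled R0..R2, C0..C3 in Figure 2.5.
    gates, drains = selection_voltages(n_rows=3, n_cols=4, row=1, col=2)
    print(injecting_cells(gates, drains))
```

Because both conditions must hold at the same cell, exactly one element of the array is selected, which is the property that tunneling, with its single condition, cannot provide.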
2.3 Floating Gates in Reconfigurable Systems
Floating-gate transistors play an important role in the development of large-scale
reconfigurable analog platforms. They can be used as reconfigurable switch matrices, as
programmable bias elements, and to remove offsets in the compiled systems. However, in
order to accomplish this, slight changes in the programming structure will be needed.
2.3.1 Floating-gate Switches
The ability to turn switches on and off, and to do it quickly, is key to creating a useful
reconfigurable system. In our MITE based FPAAs, a single floating-gate pFET is used
as a programmable switch. While using a single pFET, rather than a transmission gate,
as a switch saves space and introduces less parasitic capacitance, its resistance increases
exponentially as the signal approaches ground. This, however, can be fixed by injecting
the pFET so that its threshold voltage becomes positive, about 3 V to 4 V. (This uses the
convention that the threshold voltage of a pFET is usually negative, about −0.8 V.) This
assures that even as the source (and drain) of the device approach ground, the effective
voltage at the floating gate is still 3 V to 4 V below the source. In other words, a large
Vsg is still maintained. This allows the resistance of the pFET to be fairly constant all the
way from Vdd to gnd. Figure 2.6 shows resistance curves for the same floating-gate pFET
injected to different levels.
[Plot for Figure 2.6: resistance (Ω), logarithmic axis from 10^4 to 10^9, versus Vs = Vd + 25 mV (V), 0 to 3 V; arrow labeled "Decreasing Effective Gate Voltage"]
Figure 2.6. Resistance curves for a floating-gate pFET injected to different levels. The gate voltage is
held constant and the source and drain of the pFET are swept. A 25 mV difference is kept
between the source and drain in order to measure the resistance of the switch. Note that as
the effective floating-gate voltage is decreased the resistance becomes more constant as the
source voltage of the pFET is swept.
[Plot for Figure 2.7: resistance (Ω), logarithmic axis from 10^3 to 10^6, versus voltage (V), 0 to 3 V; curves labeled pFET, transmission gate, and FG pFET]
Figure 2.7. Comparison of resistance curves for IC switches. Note that the floating-gate switch has
nearly the same resistance as the transmission gate but contributes only half of the parasitic
capacitance. Figure reprinted from [1].
Resistance curves for different types of switches are shown in Figure 2.7.
Reconfigurable systems that do not use floating gates use either transmission gates or single
transistors as switches. These switches are usually controlled by SRAM cells. Note that the use
of a simple single transistor switch cannot provide a reasonable on-resistance from
rail-to-rail. Also note that while the transmission gate provides a slightly lower resistance
from rail-to-rail than the floating-gate pFET does, the resistance of the floating-gate
transistor actually varies less over the entire operating range. More importantly, while having
similar resistances, the floating-gate switch contributes only half the parasitic capacitance
that the transmission gate contributes.
[Figure 2.8 schematic labels: Vtun, Vb, Vcas, Iout, Vd, Prog]
Figure 2.8. A simple floating-gate current source. By modifying the charge on the floating gate, it is
possible to set any value of output current for a fixed Vb. The drain of the floating-gate
transistor is accessed during Program mode.
2.3.2 Programmable Elements
In addition to acting as switches, floating-gate transistors can be used to create the pro-
grammable elements of a reconfigurable system. These programmable elements mainly
consist of voltage and current biases. A simple floating-gate current source is shown in
Figure 2.8. By taking advantage of the floating gate, it is possible to program this source
to any value for a given bias voltage, Vb. This allows an unlimited number of bias currents
to share a single bias voltage. In addition, this bias can replace the binary current mirror,
shown in Figure 1.3, with only two transistors. This saves a great deal of die area. The
drain voltage of the floating gate, Vd, is accessed for programming purposes by asserting
the Prog signal. When this occurs, the cascode voltage, Vcas, is switched to Vdd for the
entire chip.
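As a rough illustration of how several current sources can share one bias voltage, the following sketch models a floating-gate pFET in subthreshold and solves for the programmed offset that yields a target current. The model form, the function names, and every numerical value (coupling weight, scaling current, κ) are assumptions chosen for illustration and are not taken from the fabricated chip. Voltages are referenced to the well.

```python
import math

U_T = 0.0258  # thermal voltage kT/q near room temperature, in volts

def source_current(v_b, v_offset, v_s=0.0, w=0.5, i_s=1e-15, kappa=0.7):
    """Subthreshold pFET current for a floating-gate voltage of w*v_b + v_offset.

    v_offset stands for the programmed charge divided by the total
    floating-gate capacitance (an assumed lumped model).
    """
    v_fg = w * v_b + v_offset
    return i_s * math.exp((v_s - kappa * v_fg) / U_T)

def offset_for_target(i_target, v_b, v_s=0.0, w=0.5, i_s=1e-15, kappa=0.7):
    """Programmed offset needed to obtain i_target at the shared bias v_b."""
    v_fg = (v_s - U_T * math.log(i_target / i_s)) / kappa
    return v_fg - w * v_b

shared_v_b = -0.3  # one bias voltage shared by every source (hypothetical value)
for target in (1e-9, 10e-9, 100e-9):
    offset = offset_for_target(target, shared_v_b)
    print(f"target {target:.0e} A -> offset {offset:+.4f} V")
```

In this model each decade of output current corresponds to the same step in programmed offset, which is why a single Vb can serve sources spread over a wide current range.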
Floating-gate voltage sources are also possible. By buffering the voltage stored on a
[Schematic node labels: Vtun, Vb, Vd, Vout.]
Figure 2.9. A simple floating-gate voltage source. By modifying the charge on the floating gate, it is
possible to set any value of output voltage for a fixed Vb. Any amplifier can be used to
buffer the floating gate, including a simple source follower.
floating gate, a programmable voltage bias is created. As shown in Figure 2.9, the floating-gate
transistor’s gate and drain can be accessed directly by the programming circuitry. The
choice of amplifier to use is based upon the requirements of the voltage source. It can range
from a complex operational amplifier to a simple source follower. Again, it is possible to
share a single bias voltage, Vb, between an unlimited number of programmable voltage sources.
2.3.3 Offset Removal
A major problem with large-scale analog reconfigurable systems is device mismatch. This
is because devices connected to each other may be located across the chip from one an-
other. Floating-gate transistors can be used to cancel this mismatch by programming their
respective charges properly. Not only can the mismatch in the floating-gate transistors be
accounted for, but the mismatch due to the reconfigurable architecture (current mirrors,
switches, etc.) can also be attenuated by accounting for their offset error when programming
the floating-gate devices. This is not as large a problem with customized hardware
because current mirrors, and other devices that rely on matched elements, are typically fab-
ricated as a single unit and layout techniques are used to minimize error due to mismatch.
An example of offset removal is shown in Figure 2.10 [10]. As voltage offsets lead
to scaling errors in current-mode systems, the offsets reveal themselves as multiplication
Figure 2.10. Example of offset removal using floating gates. A current-mode multiplication circuit is
shown. The voltage offsets, which lead to scaling errors in the output current, are removed
by floating-gate programming.
errors. Figure 2.10 shows results from a current-mode multiplier before and after removing
offset errors. Note that the act of programming the floating gates to remove the offset error
shifts the curves vertically on the logarithmic plot. This is equivalent to changing the value
of the multiplication.
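The vertical shift can be made explicit with a short worked relation. This is a sketch that assumes an ideal exponential voltage-to-current relationship; the symbol ΔV for the voltage offset is introduced here for illustration and does not appear in the source.

```latex
\begin{align*}
I  &= I_s\, e^{\kappa V / U_T} \\
I' &= I_s\, e^{\kappa (V + \Delta V) / U_T} = e^{\kappa \Delta V / U_T}\, I \\
\log I' &= \log I + \frac{\kappa\, \Delta V}{U_T}
\end{align*}
```

Under this assumption a fixed voltage offset multiplies the current by a constant factor, so it appears as a constant vertical displacement on a logarithmic axis, and programming the floating gate to cancel ΔV removes the gain error.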
2.3.4 Programming Considerations
In order to program switches so that they can pass a full range of voltages through them,
the standard programming algorithm must be altered slightly. The problem arises from the
fact that once a switch has been turned on, programming selectivity is lost. Selectivity is
lost because the threshold voltage of the floating-gate pFET must be shifted above Vdd in
order for the pFET to be able to pass voltages down to ground. This implies that the device
can no longer be shut off by setting the gate voltage to Vdd because current will still flow
through the channel of the device. This will cause the device to inject if the drain terminal
shared by that column of switches is pulsed. In addition, once a switch is programmed, it is
[Schematic labels: Vd, Sel, Selbar, switches G0 to G3, drain-line current Id, and the voltage drop Id * R.]
Figure 2.11. Problem with programming multiple switches that share a drain line. The resistance in-
troduced by the programming circuitry causes a voltage drop when current is flowing
through a switch on the drain line. This effectively lowers the source-drain voltage used to
inject the switches on that drain line.
no longer possible to read bias currents on the same drain line. Again, this is because there
is a large current flowing through the injected switch that cannot be shut off by setting the
gate voltage to Vdd. This can be solved by making sure that bias transistors, as well as any
device that needs to be accurately programmed, do not share drain lines with switches.
Another problem when programming switches is the inability to inject two switches on
the same drain line to the same level. This is because the standard programming algorithm
injects one switch after another. Once a switch is turned on, a large current flows through
the drain line creating a voltage drop across the transmission gate used for selecting the
given drain line. This causes the drain-source voltage set during injection to be lower
than desired when programming a second switch on the same drain line. This idea is
illustrated in Figure 2.11.
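The size of this effect can be estimated with Ohm's law. The sketch below is illustrative only: the function name, the selection resistance, the switch current, and the injection voltage are hypothetical values, not measurements from the chip.

```python
def effective_injection_voltage(v_set, i_drain_line, r_select):
    """Source-drain voltage left for injection after the Id * R drop
    across the drain-line selection circuitry."""
    return v_set - i_drain_line * r_select

# Hypothetical numbers: 6 V requested, 5 kOhm selection resistance.
v_first = effective_injection_voltage(6.0, 0.0, 5e3)      # no switch on yet
v_second = effective_injection_voltage(6.0, 100e-6, 5e3)  # one switch already conducting 100 uA
print(v_first, v_second)
```

With these assumed values the second switch sees 0.5 V less than the first. Because injection depends strongly on the source-drain voltage, the two switches would not reach the same level.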
In order to solve this problem, the standard floating-gate array was modified. Firstly,
rather than directly injecting the device that will act as the switch, indirect programming
is used. As shown in Figure 2.12a, indirect programming uses another pFET, called the
indirect programming transistor (IPT), for injecting electrons onto the floating gate that it
shares with the switch. This allows the threshold voltage of the switch to be modified by
the IPT. Secondly, because the IPT is not in the signal path, additional circuitry can be added
to aid with independent selection of an array of IPTs. A third pFET can be added in series
with the source of the IPT to make sure that no current can flow through the unselected
IPTs. This idea is illustrated in Figure 2.12b. Note that the new selection pFET does not
introduce any parasitics into the signal path.
[Schematic node labels for panels (a) and (b): Vtun, Vg, Vs, Vd, Vsp, Vdp; panel (b) adds Sel.]
Figure 2.12. Floating-gate switch programming improvements. Indirect programming can be used
to allow a second pFET to modify the floating-gate charge of the primary floating-gate
transistor, shown in (a). Vsp and Vdp are the source and drain voltages of the indirect
programming transistor. This allows for extra selection circuitry to be included in the
floating-gate array without introducing parasitics into the signal path, shown in (b). The
third pFET is controlled by a selection signal that ensures current cannot flow through the
programming transistor when it is not selected.
CHAPTER 3
MULTIPLE INPUT TRANSLINEAR ELEMENTS
At the core of the improved FPAA are translinear elements. Ideal translinear elements
have infinite input impedance, infinite output impedance, and an exponential voltage to
current relationship independent of the current level they are operating at. In addition, any
translinear element can be made to have multiple inputs by simply applying resistive or
capacitive division at the voltage input. Multiple input translinear elements (MITEs) can
thus be built using either subthreshold MOSFETs or BJTs, each of which is stronger in one
of the above specifications [11].
Some of the possible realizations of MITEs are shown in Figure 3.1. The most basic
MITE structures are a subthreshold MOSFET with capacitive or resistive summing at the
input or a BJT with resistive summing at the input. Each structure has an advantage over
the other: the subthreshold MOSFET has a much higher input impedance than the BJT, but
the BJT has a much larger dynamic range in which it holds the exponential voltage to
current relationship. Both implementations suffer from finite output impedance, but this
can be improved by adding a cascode transistor to each. Finally, the advantages of each
design can be combined by using a CMOS source follower for the input stage and a BJT
for the voltage-to-current output stage. This idea is illustrated in Figure 3.1e. Again, the
output stage is cascoded using a MOS transistor.
3.1 Implementation of a Multiple Input Translinear Element
In order to allow for the practical implementation of our FPAAs in a simple digital process,
we have chosen to use subthreshold MOSFETs. A subthreshold MOSFET has a current
that is exponentially related to its gate voltage and is given by
$$I = I_s\, e^{(\kappa V_g - V_s)/U_T}\left(1 - e^{(V_s - V_d)/U_T}\right) \tag{3.1}$$
[Schematic panels (a) through (e) with node labels Vg1, Vg2, Vd, Vcas, and Vb.]
Figure 3.1. Possible realizations of a MITE. (a) A single subthreshold transistor used to implement a
MITE. (b) A single BJT used to implement a MITE. (c) The single subthreshold transistor
is cascoded to improve the output impedance. (d) The single BJT is cascoded to improve
the output impedance. (e) A CMOS source follower and a BJT are combined to achieve
infinite input impedance and a large dynamic range.
[Schematic panels (a) and (b) with inputs V1 to Vn, capacitors C1 to Cn, and nodes Vpcas and Vd.]
Figure 3.2. Subthreshold MOSFET realization of a MITE. a) Components used to realize a MITE in a
standard CMOS process. b) Symbol used to represent a MITE.
where Is is a scaling term, κ is the capacitive division between the oxide capacitance and the
depletion capacitance, and UT is kT/q. Note that all voltages are referenced to the bulk.
Furthermore, as long as the device is in saturation (Vds > 100 mV), the second exponential
term can be neglected.
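A minimal numerical sketch of Equation 3.1 shows how small the second exponential term is at the saturation boundary. The values chosen for Is and κ are assumptions for illustration, and UT is taken at roughly room temperature.

```python
import math

U_T = 0.0258  # thermal voltage kT/q near room temperature, in volts

def subthreshold_current(v_g, v_s, v_d, i_s=1e-15, kappa=0.7, u_t=U_T):
    """Equation 3.1 with all voltages referenced to the bulk."""
    saturation_term = 1.0 - math.exp((v_s - v_d) / u_t)
    return i_s * math.exp((kappa * v_g - v_s) / u_t) * saturation_term

def saturation_factor(v_ds, u_t=U_T):
    """The bracketed term of Equation 3.1 as a function of Vds alone."""
    return 1.0 - math.exp(-v_ds / u_t)

print(f"factor at Vds = 100 mV: {saturation_factor(0.1):.4f}")
print(f"factor at Vds = 200 mV: {saturation_factor(0.2):.4f}")
```

At Vds = 100 mV the factor is already within a few percent of unity, which is the basis for dropping the term in saturation.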
Again, in order to allow for the practical realization of the MITE in a standard digi-
tal process, capacitive division is used for the introduction of multiple inputs. Figure 3.2a
shows the subthreshold MOSFET realization of a MITE, and its current-voltage relation-
ship is given by
$$I = I_s\, e^{\left(V_s - \kappa \sum_i w_i V_i\right)/U_T} \tag{3.2}$$
where wi, the weight applied to an input, is given by
$$w_i = \frac{C_i}{C_T} \tag{3.3}$$
where CT is the total capacitance at the gate of the MOSFET. Figure 3.2b shows the symbol
that will be used for this realization of a MITE. Note that while the subthreshold MOSFET
does have nearly infinite input impedance, the range in which the relationship between
current and voltage is exponential is limited. However, by making the W/L ratio of the MITEs
larger, this range can be increased. Currently the MITEs exhibit the correct behavior over
approximately 4 decades of current.
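Equations 3.2 and 3.3 can be evaluated directly. In the sketch below the capacitor values, the parasitic capacitance parameter, Is, and κ are placeholder assumptions; only the form of the equations comes from the text.

```python
import math

def mite_weights(input_caps, c_parasitic=0.0):
    """Equation 3.3: w_i = C_i / C_T, where C_T is the total gate capacitance."""
    c_total = sum(input_caps) + c_parasitic
    return [c / c_total for c in input_caps]

def mite_current(v_inputs, input_caps, v_s=0.0, i_s=1e-15, kappa=0.7,
                 u_t=0.0258, c_parasitic=0.0):
    """Equation 3.2: current set by the weighted sum of the input voltages."""
    weights = mite_weights(input_caps, c_parasitic)
    v_sum = sum(w * v for w, v in zip(weights, v_inputs))
    return i_s * math.exp((v_s - kappa * v_sum) / u_t)

# Two equal input capacitors give weights of 1/2 each when parasitics are ignored.
print(mite_weights([50e-15, 50e-15]))
print(mite_current([-0.6, -0.6], [50e-15, 50e-15]))
```

Any parasitic capacitance on the floating gate enlarges C_T and therefore reduces every weight, which is one reason the exponential range of a real MITE is narrower than the ideal model suggests.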
3.2 Synthesis Procedures
Numerous synthesis procedures have been derived for the construction of generic translin-
ear networks [12]. In addition, two synthesis procedures have been developed specifically
for MITE networks [11, 13]. While both synthesis procedures produce circuits that are
efficient in terms of the number of elements used, they have significant differences. Most
importantly, the first synthesis procedure, developed by Brad Minch, considers each output
of the system separately, while the second, developed by Shyam Subramanian, considers
the entire system at once. The ability to synthesize a system with multiple outputs allows
for the construction of the system with the fewest MITEs possible. Secondly,
Minch’s procedure treats static and dynamic functions in a similar manner, while Subrama-
nian’s procedure reduces dynamic functions into static functions and first-order low-pass
filters. Each procedure has advantages over the other. Minch’s procedure can
synthesize any ordinary differential equation that can be written in terms of elementary
functions [11], but the circuits can be quite complex. On the other hand, Subramanian’s
procedure is limited, as it cannot successfully generate every dynamic circuit, but generally
results in simpler circuits since they are broken down into static functions and first-order
filters.
In addition, the synthesis procedures discussed above can be slightly altered to accom-
modate the architecture of the reconfigurable systems. Most importantly, the fixed number
of input capacitors on each MITE must be considered. Both Minch’s and Subramanian’s
procedures can be easily altered to limit the number of input capacitors to those available.
This is because the use of extra input capacitors can be replaced with the use of extra MITEs
with the same input current. This idea can be extended to show that any MITE network
can be constructed using only MITEs with two input capacitors. While this is possible, it
may be impractical for complex systems, because the number of MITEs required would
increase dramatically.
3.3 Programmable MITEs
A major limitation of previous MITE networks has been the threshold mismatch between
individual MITEs. These mismatches have made it difficult to produce large systems that
work properly. However, because MITEs are floating-gate elements, it is possible to pro-
gram the charge on the floating gate in order to remove this mismatch. The first attempts to
do this involved subjecting the chip to ultraviolet (UV) light in order to equalize the charge
on all of the floating gates. While this did improve the matching of the MITEs, it was not
accurate enough to facilitate the implementation of large MITE systems.
Now, with the development of programming methodologies, both for generalized float-
ing gates and specifically for MITEs, it is finally possible to implement large systems. This
is the case because the charge on the floating gate of each MITE can now be adjusted independently
of the others. For example, MITEs have been used to implement adaptive filters,
a system much too complex to build before the programming advances [14]. In addition,
systems as large as chaotic oscillators, such as a Lorenz attractor, have been successfully
fabricated since the development of the programming algorithms [15].
There are two different methods that can be used to control a MITE’s terminal voltages
in order to program the given MITE. The first method is more straightforward, but it uses
more space and introduces more parasitics than the second method. The second method,
however, can only be used under certain conditions.
The first access method, shown in Figure 3.3a, is a simple adaptation of the standard
floating gate programming structure. In order to directly control the gate voltage of the
MITE, each gate capacitor has two transmission gates attached to it in order to multiplex
[Schematic panels (a) and (b) with labels Drain-line, Gate-line, Vp_cascode, Vd, and Prog; panel (a) adds gate inputs Vg1 to Vg4 and panel (b) adds Vn_cascode.]
Figure 3.3. Two methods for adapting MITEs to an array programming scheme. a) Simple adapta-
tion of the standard floating gate programming structure. Transmission gates are used to
connect each gate capacitor to the given gate voltage in programming mode. b) Cascode
programming scheme in which an nFET cascode transistor is used to isolate the gate terminal
during programming mode. This method assumes that all gate capacitors are connected
to the drain of a MITE (the node between the two cascode transistors).
the gate between a programming voltage and its gate voltage in run mode. In addition, a
transmission gate is also used to multiplex the drain of the floating-gate transistor between a
programming voltage and its drain voltage during run mode. Unlike the gate lines, a single
pFET, also used as the cascode transistor during run mode, is used to disconnect the drain
of the MITE from the rest of the circuit in programming mode. This is simply done by
setting the pFET’s cascode voltage to Vdd.
The second access method uses the fact that every gate capacitor, with a few excep-
tions, in a MITE network is connected to the drain of the cascode transistor of the MITE.
By using an nFET cascode, as well as the pFET cascode, to remove this node from the
circuit during programming mode, the gate voltage of the MITE can be set using this node.
The drain of the floating gate is treated the same as in the first access method. This pro-
gramming method is illustrated in Figure 3.3b. Note that this method requires only one
transmission gate, as well as the use of an nFET cascode transistor, in order to set the gate
voltage during programming rather than two transmission gates for each gate cap. Again,
the nFET cascode transistor is shut off by setting its cascode voltage to gnd.
Each programming method has advantages over the other. Clearly, the second method
uses less space and introduces less parasitic capacitance into the circuit. However, the second
method must be altered slightly when the MITE network includes a reference voltage.
The reference voltage must be replaced by the gate control voltage during programming
in order to couple the correct voltage onto the floating gate. More importantly, the second
method cannot be used in a reconfigurable architecture where floating-gate switches will
be connected to the gate caps. This is because when setting the gate voltage of the MITE
using the second method, the drain (or source) of the floating-gate switch will be set to the
same voltage. In other words, the selectivity inherent in the array programming scheme
may be lost.
3.4 Building Blocks of MITE systems
In order to build complex systems using MITEs, it is necessary to explore what higher level
components are commonly used. Clearly, translinear loops are a building block of almost
every system, as they implement multiply functions. In addition, log-domain filters are
commonly used, and their use will be emphasized because of Dr. Subramanian’s synthesis
procedure.
3.4.1 Translinear Loops
Translinear loops are well documented building blocks of almost every translinear system
[11, 12]. In a reconfigurable system, fixed loops are used to reduce the amount of recon-
figurability needed. A translinear loop, shown in Figure 3.4, can be analyzed by simply
solving for each MITE’s diode connected voltage. However, to simplify the analysis it is
necessary to assume that all of the floating gates have an equal amount of charge on them.
This will cancel the offset term due to the programmed charge on the floating gate. Under
this assumption, the equations are
[Schematic labels: currents I1, I2, I3, I4 and voltages V1, V2, V3, Vref.]
Figure 3.4. MITE implementation of a 2nd-order translinear loop.
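$$V_1 = \frac{2U_T}{\kappa}\log\frac{I_1}{I_s} - V_{ref} \tag{3.4}$$
$$V_2 = \frac{2U_T}{\kappa}\log\frac{I_2}{I_s} - V_1 \tag{3.5}$$
$$V_3 = \frac{2U_T}{\kappa}\log\frac{I_3}{I_s} - V_2 \tag{3.6}$$
$$V_3 + V_{ref} = \frac{2U_T}{\kappa}\log\frac{I_4}{I_s} \tag{3.7}$$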
where the factor of 2 is the ratio CT/Ci, the reciprocal of the input weight. Substituting Equations 3.4-3.6 into Equation 3.7
and rearranging gives
$$\log\frac{I_1}{I_s} + \log\frac{I_3}{I_s} = \log\frac{I_2}{I_s} + \log\frac{I_4}{I_s} \tag{3.8}$$
which can also be written as
$$I_1 I_3 = I_2 I_4 \tag{3.9}$$
In addition to the standard analysis, it is interesting to note that both MITE synthesis pro-
cedures output this circuit configuration when Equation 3.9 is entered.
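The loop equations can also be checked numerically. The sketch below chains Equations 3.4 through 3.7 and solves for I4; it assumes that log denotes the natural logarithm, and the values of Is, κ, UT, and Vref are arbitrary placeholders, since all of them cancel in the final result.

```python
import math

def loop_output_current(i1, i2, i3, v_ref=1.0, i_s=1e-15, kappa=0.7, u_t=0.0258):
    """Solve Equations 3.4-3.7 in sequence and return I4."""
    a = 2.0 * u_t / kappa
    v1 = a * math.log(i1 / i_s) - v_ref      # Equation 3.4
    v2 = a * math.log(i2 / i_s) - v1         # Equation 3.5
    v3 = a * math.log(i3 / i_s) - v2         # Equation 3.6
    return i_s * math.exp((v3 + v_ref) / a)  # Equation 3.7 solved for I4

i1, i2, i3 = 10e-9, 20e-9, 40e-9
i4 = loop_output_current(i1, i2, i3)
print(f"I4 = {i4:.3e} A, I1*I3/I2 = {i1 * i3 / i2:.3e} A")
```

Changing v_ref or i_s in the call leaves the printed result unchanged, which mirrors the cancellation that leads to Equation 3.9.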
[Plot: output current (nA, 0 to 500) versus input current (nA, 0 to 250).]
Figure 3.5. Simulation results of the translinear loop. The multiplication coefficients were chosen to be
1/10, 1/4, 1/2, 1, 2, 4, and 10.
This circuit is most often used as a multiplier with
$$I_{out} = \frac{I_a I_b}{I_c}. \tag{3.10}$$
Simulation results of the translinear loop are shown in Figure 3.5. Data was taken as Ia was
swept and the coefficient Ib/Ic was held constant. Sweeps were taken for coefficients of
1/10, 1/4, 1/2, 1, 2, 4, and 10. For higher coefficients the trace is not completely straight
because the MITEs leave the subthreshold region due to the higher current levels.
3.4.2 Filters
Filters were included as higher level blocks for two reasons: they are a building block
of almost every dynamic system and they are used as a building block by Subramanian’s
synthesis procedure. The synthesis of the circuit, found in [11], is similar to the synthesis
of the loop, but first the constraint equations are needed. The differential equation for a
first-order low-pass filter is
$$\tau \frac{d}{dt} I_y + I_y = I_x \tag{3.11}$$
where Ix is the input current, Iy is the output current, and τ is the time constant of the filter.
The chain rule can be applied to the derivative of the current giving
$$\tau \frac{dI_y}{dV_y}\frac{dV_y}{dt} + I_y = I_x \tag{3.12}$$
where Vy is the log compressed voltage associated with Iy. Taking the derivative of the
current through the 2-input MITE with respect to a single controlling voltage can be shown
to result in
$$-\tau \frac{\kappa I_y}{2U_T}\frac{dV_y}{dt} + I_y = I_x \tag{3.13}$$
where κ/2 is the weight of the controlling voltage Vy. Noting that C dVy/dt is a capacitive current
and that τκ/(2 UT C) can be written as the reciprocal of a bias current that sets the time constant of the
filter produces
$$I_\tau - I_c = \frac{I_x I_\tau}{I_y} \tag{3.14}$$
[Schematic labels: Ix, Itau, Itau, Iy, Ic, Ip.]
Figure 3.6. MITE implementation of a 1st-order low-pass log-domain filter. The Iτ bias current con-
nected to the capacitor is used to set the corner frequency of the filter. The second bias
current is set to Iτ in order to maintain unity gain.
where Ic is the current through a capacitor and Iτ is a bias current that sets the time constant
of the filter. This leaves
$$I_p = \frac{I_x I_\tau}{I_y} \tag{3.15}$$
and
$$I_\tau - I_c = I_p \tag{3.16}$$
as the two equations needed for synthesis. The circuit, shown in Figure 3.6, can now be
constructed using Minch’s synthesis procedure. Note that the circuit is essentially the same
as the translinear loop, but a capacitor is added to introduce the pole in the transfer function.
In addition, a gain term can be added to the transfer function by multiplying the second Iτ,
the bias current for the MITE without the capacitor on its drain, by the coefficient desired.
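The step from Equation 3.13 to Equation 3.14 can be written out explicitly. The definitions of I_c and I_τ used below are inferred from Equation 3.13 and the surrounding text rather than quoted from the source, so the expression for I_τ in particular should be read as an assumption.

```latex
\begin{align*}
I_c &= C\,\frac{dV_y}{dt}, \qquad I_\tau = \frac{2 C U_T}{\kappa \tau} \\
-\tau\,\frac{\kappa I_y}{2U_T}\,\frac{I_c}{C} + I_y &= I_x \\
-\frac{I_y I_c}{I_\tau} + I_y &= I_x \\
I_\tau - I_c &= \frac{I_x I_\tau}{I_y}
\end{align*}
```

The last line follows from multiplying the third by I_τ/I_y, and splitting it at the intermediate current I_p gives Equations 3.15 and 3.16.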
The simulation results of the filter are shown in Figure 3.7. Frequency responses were
taken for different values of Iτ. The values used were 0.1nA, 1nA, 10nA, 100nA, and 1µA.
Both bias currents were kept equal in order to have unity gain through the filter. Note that
[Plot: output magnitude (dB, 0 to −70) versus frequency (Hz, logarithmic axis from 10^0 to 10^8).]
Figure 3.7. Simulation results of 1st-order low-pass filter. Iτ was set to 0.1nA, 1nA, 10nA, 100nA, and
1µA for the different curves respectively.
the cutoff frequencies of the filter are equally spaced on the log scale, as are the values of
the bias current. The bumps seen in the response of the filter can be attributed to the non-
idealities introduced by the gate capacitors of the MITEs. These bumps can be eliminated
by moving to a differential structure where two filters share a single reference voltage.
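The equal spacing follows from the time constant being inversely proportional to the bias current. The sketch below assumes the relation τ = 2 C UT / (κ Iτ), which is inferred from Equation 3.13 with a capacitive current C dVy/dt; the capacitance and κ are placeholder values, so the absolute frequencies are illustrative and only their ratios are meaningful.

```python
import math

def corner_frequency(i_tau, c=1e-12, kappa=0.7, u_t=0.0258):
    """Corner frequency in Hz for an assumed tau = 2*C*U_T / (kappa*I_tau)."""
    tau = 2.0 * c * u_t / (kappa * i_tau)
    return 1.0 / (2.0 * math.pi * tau)

# Bias currents used in the simulation of Figure 3.7.
bias_currents = [0.1e-9, 1e-9, 10e-9, 100e-9, 1e-6]
corner_frequencies = [corner_frequency(i) for i in bias_currents]
for i_tau, f_c in zip(bias_currents, corner_frequencies):
    print(f"I_tau = {i_tau:.1e} A -> f_c = {f_c:.3e} Hz")
```

Each decade of bias current moves the corner by one decade of frequency in this model, independent of the assumed capacitance.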
CHAPTER 4
RECONFIGURABLE MITE ARCHITECTURES
One of the most important decisions when building a reconfigurable system is the
granularity chosen for the reconfigurability. If a fine granularity is chosen, the architecture
does not need high level components, but rather these components can be built out of the
lower level components. However, this architecture has the downside of introducing a great
deal of parasitic resistances and capacitances into the circuits compiled onto the system. If a
coarse granularity is chosen, the architecture requires a number of higher level components
that can be reconfigured into complex systems. For MITE systems, these blocks include
current splitters, translinear loops, and filters. This architecture tends to require fewer
switches and therefore introduces fewer parasitics. However, this architecture is also less
flexible than the fine grain architecture.
Two distinct architectures for reconfigurable analog systems that use MITEs as their
computational components have been created. While both architectures are designed to
implement similar sets of functions, they use different techniques to construct the given
circuits. The first architecture has very fine grain reconfigurability meant to be as flexible as
possible. The second architecture has a mixed granularity of reconfigurability that improves
upon the first architecture. Both architectures have been designed to take advantage of
Subramanian’s synthesis procedure.
4.1 The RAAM
The Reconfigurable Analog Array of MITEs (RAAM) is an FPAA that focuses on fine
granularity and the use of a known synthesis procedure. The RAAM was meant to show
the promise of a reconfigurable translinear platform. Although it was limited to a very
small die area and contained rather simple components, the RAAM was able to compile a
significant number of useful circuits.
[Block diagram labels: Programming Structure (top and side), Global Switch Matrix, three MITE CABs, and a Specialized CAB.]
Figure 4.1. System architecture of the RAAM, an FPAA used to create reconfigurable translinear
networks. The system consists of 3 MITE CABs, a specialized CAB, and a global switch
network. The specialized CAB consists of circuitry that enables four-quadrant dynamic
functions and also includes the input bank of V-I converters.
4.1.1 Original System Architecture
Standard FPAA approaches with similar granularity to the RAAM (transistor level) exist
[3], but are impractical for constructing complex systems. In order to avoid this problem,
the RAAM takes advantage of MITEs with 4 gate capacitors in order to consolidate large
circuits into structures that require significantly fewer components (transistors). Four input
capacitors per MITE were chosen because this allows for the construction of complex systems
with relatively few components while also keeping the connectivity needed to reconfigure
the devices from becoming too large. This is because a vertical connection line in the switch
matrix is needed for every input capacitor, as well as the drain of the MITE, if the devices
are to be fully reconfigurable. Currently, current-mode routing is used in the RAAM. While
this does not allow for the broadcasting of signals, it helps to minimize offsets because the
current mirrors used to route signals can be laid out as a single component. In addition,
the 4-input MITEs help to offset the need to broadcast signals because they allow for the
consolidation of multiple networks into a single network.
The RAAM is broken up into four core structures: the global switch matrix, the MITE
Configurable Analog Block (MITE CAB), the dynamics unit, and the input bank. The
architecture is illustrated in Figure 4.1. The MITE CAB is a core grouping of MITEs and a
shared local switch matrix. The dynamics unit is a collection of first-order low-pass filters
and subtraction units that can extend static functions to dynamic functions. The input bank
is an array of V to I converters. Both the dynamics unit and the input bank are housed in
the specialized CAB. The global switch matrix is a floating-gate switch array that connects
all the other units together.
4.1.2 Original MITE CAB
The MITE CAB, shown in Figure 4.2, is made up of 8 MITEs, each with 4 gate capacitors,
connected to a local switch matrix. Of the 8 MITEs, 4 have one gate capacitor implicitly
diode connected; these are referred to as input MITEs. The other 4 MITEs are referred to as
output MITEs. The switch matrix is a full crossbar matrix of floating-gate pFET switches
with the drain and 4 gate capacitors of each MITE connected to its vertical lines. The
horizontal connection lines are used to connect the gate capacitors and drains of the MITEs
in the circuit configuration that is desired. They also connect to the global switch matrix
allowing for the combination of MITE CABs to form larger circuits and connections to be
made to the specialized CAB. The layout of the RAAM is shown in Figure 4.3.
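The connectivity of a MITE CAB can be pictured with a small data-structure sketch. The class, the line names, and the number of horizontal lines are hypothetical; the text fixes only the 8 MITEs, each contributing a drain and 4 gate capacitors as vertical lines. The implicit diode connection of the input MITEs is not modeled.

```python
class CrossbarSwitchMatrix:
    """Toy model of a full crossbar: any horizontal line can reach any vertical line."""

    def __init__(self, vertical_lines, n_horizontal):
        self.vertical_lines = list(vertical_lines)
        self.n_horizontal = n_horizontal
        self.closed = set()  # (horizontal index, vertical name) pairs that are turned on

    def close_switch(self, horizontal, vertical):
        if not 0 <= horizontal < self.n_horizontal:
            raise ValueError(f"no horizontal line {horizontal}")
        if vertical not in self.vertical_lines:
            raise ValueError(f"no vertical line {vertical!r}")
        self.closed.add((horizontal, vertical))

    def net(self, horizontal):
        """Vertical lines connected together through one horizontal line."""
        return sorted(v for h, v in self.closed if h == horizontal)


def mite_cab_vertical_lines(n_mites=8, n_gate_caps=4):
    """One drain line and n_gate_caps gate lines per MITE."""
    lines = []
    for m in range(n_mites):
        lines.append(f"M{m}.drain")
        lines.extend(f"M{m}.g{k}" for k in range(1, n_gate_caps + 1))
    return lines


# n_horizontal=16 is a placeholder; the source does not state the number of rows.
cab = CrossbarSwitchMatrix(mite_cab_vertical_lines(), n_horizontal=16)
cab.close_switch(0, "M0.drain")  # tie the drain of MITE 0 ...
cab.close_switch(0, "M0.g1")     # ... to one of its own gate capacitors
print(len(cab.vertical_lines), cab.net(0))
```

The 40 vertical lines per CAB show why the number of gate capacitors per MITE directly drives the size of the switch matrix.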
4.1.3 Examples and Results
In order to demonstrate the reconfigurability of the system, several circuits were pro-
grammed onto it and results were taken for each. The first circuit built was a squaring
circuit. In order for the synthesis of the circuit to be possible, a reference current must be
used so that the unit of the result is still amperes [11]. Thus, the actual equation synthesized
is
$$I_{out} = \frac{I_{in}^2}{I_{ref}} \tag{4.1}$$
where Iref effectively sets the unity value of the circuit. The compilation of the circuit onto
the system is shown in Figures 4.4 and 4.5. The circuit schematic is shown in Figure 4.4.
Figure 4.2. Schematic of a MITE CAB. The MITE CAB consists of 4 input MITEs and 4 output MITEs
whose gate capacitors and drains can be connected through a local floating-gate switch
matrix.
Figure 4.3. Layout of the RAAM. The FPAA was fabricated in a 0.5 µm process on a 1.5 mm x 1.5 mm die.
[Schematic labels: Iout, Iref, Iin.]
Figure 4.4. Schematic of the squaring circuit implemented on the RAAM. The colored nodes corre-
spond to Figure 4.5.
In Figure 4.5, the connection lines below both the MITEs and the V-I converters represent
the local switch matrices, while the vertical lines connecting the two, on the left of the
figure, represent the global switch matrix. The colored lines in both figures show how
the circuit is built using the reconfigurable architecture. In addition, the circles represent
switches that are turned on in order to compile the circuit.
[Diagram labels: V-I 1 through V-I 6.]
Figure 4.5. Example of the RAAM reconfigured to implement a squaring circuit. The colored nodes
correspond to Figure 4.4 and the circles at the intersection of the bus lines indicate a
switch that has been turned on. The row of V-I converters and the crossbar network below
it represent the specialized CAB, the crossbar network on the left of the figure represents
the global switch matrix, and the row of MITEs and the crossbar network below it
represent a MITE CAB.
The results of the squaring circuit for different reference currents are shown in Fig-
ure 4.6. Note that the measured results deviate from the theoretical when the MITEs are
no longer operating in the subthreshold region and thus no longer have an exponential re-
lationship between voltage and current. This is seen most clearly in the squaring circuit
because the output current grows much faster than the input once the input is larger than
the reference current.
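For reference, the ideal behavior of Equation 4.1 is easy to tabulate. The function below is a sketch of the theoretical curve only; it does not model the departure from subthreshold operation seen in the measurements.

```python
def squaring_output(i_in, i_ref):
    """Ideal squaring circuit of Equation 4.1: Iout = Iin^2 / Iref."""
    return i_in ** 2 / i_ref

i_ref = 100e-9
for i_in in (50e-9, 100e-9, 200e-9, 300e-9):
    i_out = squaring_output(i_in, i_ref)
    print(f"Iin = {i_in * 1e9:5.0f} nA -> Iout = {i_out * 1e9:6.1f} nA")
```

The output equals the input when Iin = Iref and grows by a factor of four for each doubling of the input beyond that point, so the output MITE is the first device to be pushed toward the edge of the subthreshold region.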
The next circuit compiled onto the RAAM was a 2nd-order translinear loop configured
as a multiplier, shown in Figure 3.4. The connections needed to build the loop are shown
in Figure 4.7. The multiplier’s output is given by
$$I_{out} = \frac{I_a I_b}{I_c} \tag{4.2}$$
where Ia, Ib, and Ic are the three input currents. The results of the multiplier are shown in
Figure 4.8. The coefficients of the multiplication, Ib/Ic, were chosen over a wide range to
illustrate the versatility of the circuit.
[Plot: output current (nA, 0 to 400) versus input current (nA, 0 to 300); legend: Measured, Theoretical.]
Figure 4.6. Results of the squaring circuit compiled onto the RAAM for different reference currents.
Reference currents of 50nA, 100nA, 200nA, and 300nA were used.
[Diagram labels: V-I 1 through V-I 6.]
Figure 4.7. Example of the RAAM reconfigured to implement a multiplier. The circuit schematic is
shown in Figure 3.4; however, the reference voltage has been connected to one of the
controlling voltages. The circles at the intersection of the bus lines indicate a switch
that has been turned on.
Next, a circuit that implements a square root function was built. Again, the synthesis of
the circuit, which requires a reference current, can be found in [11]. The synthesis results
in the equation
$$I_{out} = \sqrt{I_{in} I_{ref}} \tag{4.3}$$
where Iref again sets the unity value of the circuit. The results are shown in Figure 4.9. The
measured data is closer to the ideal over a larger input current range than with the squaring
circuit because the MITEs do not leave subthreshold as quickly.
The fourth circuit programmed onto the device used both a square and a square root
function in order to calculate the magnitude of a two-dimensional vector, whose equation
is given by
$$I_{out} = \sqrt{I_x^2 + I_y^2} \tag{4.4}$$
where Ix and Iy represent the x-coordinate and the y-coordinate of the vector. Although the
circuit could take the square root of the sum of the two squares by using the previously dis-
cussed circuits, the system can be consolidated onto only 6 MITEs [11]. The initial results
of the vector magnitude circuit are shown in Figure 4.10a. These results were obtained
after programming all 6 MITEs to the same current level. Each MITE was programmed
to have 10nA of current with a source-gate voltage of 1.3V and a source-drain voltage of
2.3V . Note that while the circuit shows the correct behavior, there are gain terms from the
current mirrors that introduce errors into the result, producing
Iout =
√
0.8I2x + 0.8I2y . (4.5)
One advantage of programmable computational elements is that these error terms can be
canceled. By shifting the Vth of the two MITEs that perform the squaring functions higher
than the other MITEs using injection, the coefficients were both increased to 1. The output
of the vector magnitude circuit after programming out the error terms is shown in Figure
4.10b.
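As a rough numerical illustration of Equations 4.4 and 4.5, the following Python sketch compares the ideal vector magnitude with the output obtained when both squared terms carry a gain of 0.8. The function name, the gain parameter names, and the sample currents are assumptions made for illustration only; they are not taken from the measurement setup.

```python
import math

def vector_magnitude(i_x, i_y, gain_x=1.0, gain_y=1.0):
    """Vector magnitude with optional gain terms on the squared inputs.

    gain_x and gain_y are hypothetical names for the current-mirror gain
    terms of Equation 4.5; leaving both at 1 gives Equation 4.4.
    """
    return math.sqrt(gain_x * i_x**2 + gain_y * i_y**2)

# Illustrative input currents in nA (assumed values, not measured data).
i_x, i_y = 120.0, 160.0

ideal = vector_magnitude(i_x, i_y)                  # Equation 4.4
uncorrected = vector_magnitude(i_x, i_y, 0.8, 0.8)  # Equation 4.5

# Equal gains on both squares scale the output by sqrt(0.8) at every angle.
scale = uncorrected / ideal
print(f"ideal = {ideal:.1f} nA, uncorrected = {uncorrected:.1f} nA, scale = {scale:.3f}")
```

Because the two coefficients are equal, the error in this idealized model is a pure gain of sqrt(0.8) that does not depend on the angle of the vector, which is consistent with correcting it by reprogramming only the two squaring MITEs.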
Figure 4.8. Results of the 2nd order translinear loop when used as a multiplier. The coefficient of the
multiplication was varied to show the versatility of the circuit.
[Plot: output current (nA) versus input current (nA).]

Figure 4.9. Results of the square root circuit for different reference currents. Reference currents of
50nA, 100nA, 200nA, and 300nA were used.
[Plot: output current (nA) versus input current (nA), measured and theoretical curves.]

Figure 4.10. Results of the vector magnitude circuit. a) Results of the vector magnitude circuit after
programming all MITEs to the same level. Each MITE was programmed to have 10nA of
current with a source-drain voltage of 2.3V and a source-gate voltage of 1.3V. b) Results
of the vector magnitude circuit after programming out the initial errors. The MITEs
performing the squaring functions were injected higher than the other MITEs in order to
increase the coefficients to 1.
[Plots (a) and (b): Output*sin(θ) (nA) versus Output*cos(θ) (nA), measured and theoretical curves.]

Figure 4.11. Frequency response of the 1st-order log-domain filter for different bias currents. Note that
the limited frequency response of the filter was due to the V-I converter used for the input.
[Plot: magnitude (dB) versus frequency (Hz).]

Finally, a 1st-order log-domain filter was built. The circuit is essentially the same as
the translinear loop, but a capacitor is added to the drain of one of the MITEs [11]. The
cutoff frequency of the filter is determined by a bias current. The frequency response of the
filter, for different bias currents, is shown in Figure 4.11. Note that the limited frequency
response of the filter was due to the V-I converter used for the input.
4.2 An Improved Translinear FPAA
In order to improve the usability and functionality of the MITE FPAA, an improved version
was designed. The new MITE FPAA, called the MFPAA, was again based on the new
generation of FPAA, called the RASP, designed by Christopher Twigg. This was done to
make sure that the RASP and the MFPAA could be easily interfaced with each other.
4.2.1 Improved System Architecture
The architecture of the MFPAA, shown in Figure 4.12, is similar to the original version of the
MITE FPAA but uses a more complex routing scheme in order to reduce the parasitic
capacitance of the switch matrices. The vertical routing is divided between local only, nearest
neighbor, and global routing, while the horizontal routing is limited to global. This stems
from how the shape of the CAB fits into the MFPAA: the array of CABs is 6 tall but only
3 wide. This system allows connections that are more sensitive to parasitic capacitance,
such as those used in filters and other dynamic systems, to be routed using the local routing
without losing much flexibility in the routing fabric. Eighteen CABs were able to fit into
the MFPAA using a 3mmx3mm die. Seventeen of the CABs are MITE CABs while one is
reserved for input and output structures. The input and output structures will be discussed
in later chapters.
The CAB was also redesigned with system-level concerns in mind. First, in order
to allow for the most flexible routing fabric, currents or voltages can be routed to all of
the computation elements. This allows for the use of Kirchhoff’s Current Law (KCL) to
implement addition and subtraction of signals in the current domain. It also allows for the
ability to broadcast signals easily by routing voltages. Second, bias current generators and
current mirrors were included in each CAB. This, combined with the use of local routing,
reduces the amount of routing needed between CABs. This also reduces any error in the
bias currents that were caused by the routing network.
4.2.2 Improved MITE CAB
The most significant changes in the architecture of the MFPAA are within the MITE CAB.
In order to improve the density of computation elements relative to switch elements, single
MITEs must be replaced with computational blocks with less reconfigurability. In order to avoid
losing flexibility, the new computation element, shown in Figure 4.13, was chosen by trying
to maximize the number of equations the element could implement while minimizing the
reconfigurability needed. A structure in which the output MITE's gates are reconfigurable
while the input MITEs' gates are fixed allows for this possibility. Five input MITEs were
chosen because this allows for the largest number of equations to be implemented while
remaining practical; increasing the number of input MITEs any further would result in a large
number of unused MITEs for most computations. Two of these elements are contained in
each CAB. The CAB also includes a first-order log-domain filter, shown in Figure 4.14.
The filter will be used to synthesize any dynamic functions ranging from simple filters to
complex oscillators. Again, this was done to increase the density of the computational
elements without losing too much reconfigurability. This also lends itself to implementing
the synthesis procedures on the MFPAA, since Subramanian's synthesis procedure implements
dynamic functions by combining static functions with first-order filters. In addition, the
CAB includes six bias current generators and six current mirrors. The bias currents are
used for implementing coefficients and scaling currents in the input equations. The current
mirrors are used for adding and subtracting as well as signal routing. The layout of the
MFPAA is shown in Figure 4.15.

Figure 4.12. System architecture of the improved MITE FPAA. The FPAA consists of 17 MITE CABs
and a single IO CAB. The vertical routing between CABs is organized into local only,
nearest neighbor, and global. The horizontal routing is global only since there are only
3 columns in the FPAA. Note that the horizontal routing connects to the global vertical
routing.
[Diagram labels: MITE CAB (17 blocks), IO CAB, Programming Circuitry.]

Figure 4.13. Basic MITE computation element of the improved MITE FPAA. The computation element
consists of 5 input MITEs in a translinear loop configuration and 1 output MITE. The
gates of the output MITE are sent into the switch matrix where they are connected to
any of the input MITE gate voltages through programmable switches. This configuration
gives a large number of implementable equations without the need for a large amount of
internal reconfigurability. In addition, it is possible to use currents or voltages as the inputs
to each MITE; the gate of the nFET is used for voltages (the MITE's drain is left floating)
while the drain of the MITE is used for currents (the gate of the nFET is grounded).
[Schematic labels: Vcas, To Switch Matrix.]

Figure 4.14. Log-domain filter of the improved MITE FPAA. The MITE FPAA uses a standard
first-order MITE log-domain filter in order to implement dynamic functions. As in the
case of the computation element, the input MITEs can use either voltages or currents as
inputs.
[Schematic labels: Vcas, To Switch Matrix, Cfilter.]
4.2.3 Examples and Results
In order to test the MFPAA, a wide range of circuits was compiled onto it. First, some
static functions were tested, including circuits for multiplying, squaring, taking a cube root,
and raising to the 2/3 power. Next, dynamic functions were tested. These included a low-pass
filter, a high-pass filter, and an RMS-to-DC converter. The circuits were compiled using
the synthesis procedures developed by Dr. Subramanian.
Figure 4.15. Layout of the MFPAA. The FPAA was fabricated in a 0.35µm process on a 3mmx3mm die.
[Image: die layout.]
4.2.3.1 Static Examples
The first static equation compiled onto the MFPAA implemented a 2nd-order translinear
loop. This implements the equation
\[ I_{out} = \frac{I_a I_b}{I_c}. \tag{4.6} \]
In order to test this circuit, Ia was swept while Ib and Ic were held constant. In addition,
Ib/Ic, was set to produce a variety of coefficients. The results are shown in Figure 4.16.
Note that the output current tends to bend in the direction of the line Iout = Iin.
The bending in the previous example is due to κ variation in the MITE's output
characteristic. The effective κ of a MITE was measured with respect to its drain current
and the results are shown in Figure 4.17. Note that the effective κ of the MITE changes by
more than 30% over the MITE's standard operating range (up to 500nA). The output of a
multiplication circuit using a 2nd-order translinear loop was calculated using the data found
in Figure 4.17. The results are shown in Figure 4.18. Note that the calculated outputs of
the multiplication circuit show the same behavior as the actual results: outputs for
coefficients less than one bend upwards and outputs for coefficients greater than one bend
downwards. The calculated results are not smooth due to the noisy κ data.
In order to improve the results, the geometric mean of Ib and Ic was kept equal
to the input current. This constraint keeps the κs of the MITEs approximately equal. The
improved results are shown in Figure 4.19.
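The geometric mean constraint fixes both Ib and Ic once the coefficient k = Ib/Ic and the input current are chosen: Ib = Iin·sqrt(k) and Ic = Iin/sqrt(k). The Python sketch below works through this arithmetic for the five coefficients used in the figures. The function names and the 100nA input level are assumptions for illustration, and the loop is modeled by the ideal Equation 4.6 with no κ variation.

```python
import math

def coefficient_currents(i_in, k):
    """Return (Ib, Ic) with Ib / Ic = k and sqrt(Ib * Ic) = i_in."""
    root_k = math.sqrt(k)
    return i_in * root_k, i_in / root_k

def translinear_multiply(i_a, i_b, i_c):
    """Ideal 2nd-order translinear loop of Equation 4.6."""
    return i_a * i_b / i_c

i_in = 100e-9  # illustrative input current of 100 nA (assumed value)
for k in (0.25, 0.5, 1.0, 2.0, 4.0):
    i_b, i_c = coefficient_currents(i_in, k)
    i_out = translinear_multiply(i_in, i_b, i_c)
    print(f"k = {k:4.2f}: Ib = {i_b * 1e9:6.1f} nA, Ic = {i_c * 1e9:6.1f} nA, "
          f"Iout = {i_out * 1e9:6.1f} nA")
```

For a coefficient of 4, for example, the two bias currents sit a factor of 2 above and below the input current, so no MITE in the loop operates more than a factor of 4 away from the input level.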
Next, a squaring circuit was compiled onto the MFPAA. The circuit uses a scaling
current, Is, that determines the value of unity in the system. This idea is also seen in the
equation
\[ I_{out} = \frac{I_{in}^2}{I_s}, \tag{4.7} \]
which describes the system's input-output relationship. The results of the squaring circuit
are shown in Figure 4.20. The most important feature of the output characteristic is its
inaccuracy for large ratios of input current to scaling current. Such ratios cause currents
larger than the subthreshold range to flow through the output MITE.
Figure 4.16. Results of a coefficient multiplication implemented with the MITE Computational
Element. The results are shown in a linear plot (top) and a logarithmic plot (bottom) to show
both the accuracy and the dynamic range of the computation. Note that the bending in
the logarithmic plot is due to κ variation with respect to the drain current of the MITE.
[Plots: output current versus input current on linear (nA) and logarithmic (A) axes for
Iout = Iin/4, Iin/2, Iin, 2Iin, and 4Iin. Panel title: Coefficient Multiplication using the MCE.]

Figure 4.17. Variation of the effective κ over the standard operation range of a MITE is shown. κ was
calculated by doing a numerical logarithmic derivative. Note that over the operating range
of the MITE, a 30% change is seen.
[Plot: effective κ versus drain current (A). Panel title: κ Variation in a MITE.]

Figure 4.18. Theoretical response of a 2nd-order loop including κ variation. The ideal MITE
characteristic was altered to include a variable κ based on the data shown in Figure 4.17. The new
characteristic was used to calculate the output of the translinear loop. Note that the error
produced by the variable κ shows the same behavior as the data in Figure 4.16. Specifically,
for coefficients less than 1 the output bends upwards and for coefficients greater than
1 the output bends downwards.
[Plot: output current (A) versus input current (A) for Iout = Iin/4, Iin/2, Iin, 2Iin, and 4Iin.
Panel title: Theoretical Multiplication Output with κ Variation.]

Figure 4.19. Results of the improved coefficient multiplication implemented with the MITE
Computational Element. The results are shown in a linear plot (top) and a logarithmic plot (bottom)
to show both the accuracy and the dynamic range of the computation. Note that the bending
in the logarithmic plot is greatly reduced by keeping the current levels of the MITEs
approximately equal, reducing the effect of κ variation.
[Plots: output current versus input current on linear (nA) and logarithmic (A) axes for
Iout = Iin/4, Iin/2, Iin, 2Iin, and 4Iin. Panel title: Improved Coefficient Multiplication using the MCE.]

Figure 4.20. Results of a squaring circuit implemented with the MITE Computational Element. The
results are shown in a linear plot (top) and a logarithmic plot (bottom) to show both the
accuracy and the dynamic range of the computation. Note that the inaccuracy at high
output currents is due to devices leaving subthreshold operation.
[Plots: output current versus input current on linear (nA) and logarithmic (A) axes for
Is = 10nA, 23.4nA, 54.8nA, 128.2nA, and 300nA. Panel title: Current Squaring using the MCE.]

Figure 4.21. Circuit which implements a cube root on the MFPAA. A second output MITE, from the
other MCE in the CAB, is used to gain access to the output current. In addition, a current
mirror is used to feed back the output current to create the cube root.
[Schematic labels: Vcas, Iin, Is, Iout.]

A cube root circuit was also compiled on the MFPAA. The circuit is shown in Figure
4.21. The output MITE of another MITE Computational Element (MCE) is used to gain
access to the output current. Again, a scaling current is used to set the value of unity in the
system. The equation that describes the system is
\[ I_{out} = I_{in}^{1/3} I_s^{2/3}. \tag{4.8} \]
The results of the cube root are shown in Figure 4.22. In contrast to the squaring circuit,
the cube root results are more accurate because of the circuit's compressive nature.
In addition to the cube root, the same circuit computes Iin^(2/3). This can be accomplished
by reversing the roles of Iin and Is in the previous circuit. The results of this new circuit are
shown in Figure 4.23. Again, the circuit is more accurate due to its compressive nature.
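To make the role of the scaling current concrete, the following Python sketch evaluates the ideal forms of Equations 4.7 and 4.8, together with the 2/3-power function obtained by swapping Iin and Is. The function names and the sample currents are assumptions for illustration; the sketch models only the ideal equations, not the measured behavior.

```python
import math

def square(i_in, i_s):
    """Equation 4.7: Iout = Iin^2 / Is."""
    return i_in**2 / i_s

def cube_root(i_in, i_s):
    """Equation 4.8: Iout = Iin^(1/3) * Is^(2/3)."""
    return i_in ** (1.0 / 3.0) * i_s ** (2.0 / 3.0)

def two_thirds_power(i_in, i_s):
    """Equation 4.8 with the roles of Iin and Is reversed: Iin^(2/3) * Is^(1/3)."""
    return cube_root(i_s, i_in)

i_s = 10.0  # scaling current in nA (one of the values used in the figures)
for i_in in (10.0, 80.0, 300.0):  # illustrative input currents in nA
    print(f"Iin = {i_in:5.1f} nA: square = {square(i_in, i_s):7.1f} nA, "
          f"cube root = {cube_root(i_in, i_s):5.1f} nA, "
          f"2/3 power = {two_thirds_power(i_in, i_s):5.1f} nA")
```

All three functions return Is when Iin equals Is, which is the sense in which the scaling current sets unity. With Is = 10nA and Iin = 300nA, the ideal squaring output is 9000nA, far beyond the subthreshold range, while the two compressive functions stay well below the input current.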
Figure 4.22. Results of a cube root circuit on the MFPAA. The results are shown in a linear plot (top)
and a logarithmic plot (bottom) to show both the accuracy and the dynamic range of the
computation. Note that the results are more accurate than the squaring circuit because of
the compressive nature of the cube root.
[Plots: output current versus input current on linear (nA) and logarithmic (A) axes for
Is = 10nA, 23.4nA, 54.8nA, 128.2nA, and 300nA. Panel title: Results of a Cube Root using the MCE.]

Figure 4.23. Results of an Iout = Iin^(2/3) circuit on the MFPAA. The results are shown in a linear
plot (top) and a logarithmic plot (bottom) to show both the accuracy and the dynamic range of the
computation. Note that the results are more accurate than the squaring circuit because of
the compressive nature of the function.
[Plots: output current versus input current on linear (nA) and logarithmic (A) axes for
Is = 10nA, 23.4nA, 54.8nA, 128.2nA, and 300nA. Panel title: Results of Iout = Iin^(2/3).]

4.2.3.2 Dynamic Examples
The first dynamic circuit compiled onto the MFPAA was a first-order low-pass filter. The
filter is included as one of the CAB components on the MFPAA. The filter was tested
by adjusting the bias currents that set the corner frequency of the filter and measuring
the transfer function. The results are shown in Figure 4.24. The responses shown are
for 5 logarithmically spaced bias currents between 3nA and 41nA. Note that the highest
achievable corner frequency is approximately 75kHz.

Figure 4.24. The transfer function of a first-order low-pass filter for various bias currents is shown.
The bias currents used were logarithmically spaced between 3nA and 41nA. Note that the
highest achievable corner frequency is approximately 75kHz.
[Plot: output magnitude (dB) versus frequency (Hz) for Iτ = 3nA, 6nA, 12nA, 23nA, and 41nA.
Panel title: Frequency Response of Low-pass Filter Implemented on the MFPAA.]
Next, a first-order high-pass filter was compiled onto the MFPAA. The filter was built
by subtracting a low-pass filtered version of the input from the original signal. Again, the
filter was tested by measuring the transfer function for multiple bias currents. The bias
currents are logarithmically spaced between 4nA and 106nA. The frequency response of
the entire system is more apparent here than in the low-pass filter case. The results are
shown in Figure 4.25.
Figure 4.25. The transfer function of a first-order high-pass filter for various bias currents is shown.
The bias currents used were logarithmically spaced between 4nA and 106nA. Note that the
frequency response of the overall system begins to affect the higher frequencies.
[Plot: output magnitude (dB) versus frequency (Hz) for Iτ = 4nA, 7nA, 14nA, 28nA, 54nA, and 106nA.
Panel title: Frequency Response of High-pass Filter Implemented on the MFPAA.]

An RMS-to-DC converter was also compiled onto the MFPAA. A combination
of three static and dynamic circuits, in addition to the VI converter, is needed in order to
realize the converter. First, the input, which has been rectified by the input VI structure, is
squared. Second, it is passed through a low-pass filter to find the mean. Third, the square
root of the mean is found. The converter was tested by varying the input amplitude of a
sine wave and measuring the output current. The results are shown in Figure 4.26.

Figure 4.26. Output characteristic of the RMS-to-DC converter is shown. The amplitude of the input
sinusoid was swept from 0.1V to 4.5V. The frequency of the input was held constant at 500Hz.
[Plot: output current (nA) versus Vpp of input sinusoid, measured data and linear fit.
Panel title: RMS to DC Converter Output.]
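The square, mean, square-root chain described above can be sketched numerically. The Python example below is an idealized model under stated assumptions: the low-pass filter is replaced by an exact average over one period, the scaling current i_s is a hypothetical parameter of the squaring and square-root stages, and the input is a sampled sine wave rather than the output of the VI converter.

```python
import math

def rms_to_dc(samples, i_s=1.0):
    """Idealized RMS-to-DC chain: rectify, square, average, square root."""
    rectified = [abs(s) for s in samples]        # rectification by the input VI structure
    squared = [r**2 / i_s for r in rectified]    # squaring stage, Iout = Iin^2 / Is
    mean = sum(squared) / len(squared)           # exact average stands in for the low-pass filter
    return math.sqrt(mean * i_s)                 # square-root stage, Iout = sqrt(Iin * Is)

def sine(amplitude, n=1000):
    """One full period of a sine wave sampled at n points."""
    return [amplitude * math.sin(2.0 * math.pi * k / n) for k in range(n)]

for amplitude in (0.5, 1.0, 2.0):
    print(f"amplitude = {amplitude:.1f}: output = {rms_to_dc(sine(amplitude)):.4f}")
```

In this model the output is the amplitude divided by sqrt(2) and does not depend on the scaling current, since the Is terms of the squaring and square-root stages cancel. The output is therefore linear in the input amplitude, which matches the form of the linear fit in Figure 4.26.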
4.2.3.3 A Sinusoidal Oscillator with Independent Frequency and Amplitude Control
In order to demonstrate a more complex system, a sinusoidal oscillator with independent
frequency and amplitude control was compiled onto the MFPAA. The synthesis of the
circuit is described in [16]. Some of the smaller components of the system were tested
individually before the final compilation to make debugging easier.
The oscillator can be broken down into three main components: translinear loops for
calculating static equations, first-order low-pass filters, and current splitters. Since both
the translinear loops and filters have already been tested, the current splitter is the only
component that was tested separately. The oscillator uses two current splitters to allow
both the x and y signals to be bidirectional. Current splitters are used to create a
differential signal from a single-ended input signal. This provides the advantage of allowing
for both differential and bidirectional circuits to be built. Figure 4.27 shows the topology
of the current splitter. The current splitter uses a geometric mean constraint on its output
currents to control the common-mode output levels, described as
\[ I_{out1} I_{out2} = I_{dc}^2 \tag{4.9} \]
where Idc sets the geometric mean of the output currents. In addition, by KCL,
\[ I_{in} = I_{out1} - I_{out2}. \tag{4.10} \]
The results of the current splitter are shown in Figure 4.28. Note that the geometric mean of
the two output currents stays nearly constant, changing only 3nA over the entire operating
range. It is not important for the geometric mean to be exactly constant as long as the
difference of the output currents is equal to the input current. This is guaranteed through
KCL.
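Equations 4.9 and 4.10 can be solved together for the two output currents: substituting Iout2 = Iout1 − Iin into Equation 4.9 gives a quadratic whose positive root is Iout1 = (Iin + sqrt(Iin^2 + 4 Idc^2))/2. The Python sketch below evaluates this ideal solution. The function name is an assumption, and the 10nA value for Idc is only an illustrative choice.

```python
import math

def current_splitter(i_in, i_dc):
    """Ideal geometric-mean current splitter.

    Returns (Iout1, Iout2) satisfying Iout1 * Iout2 = Idc^2 (Equation 4.9)
    and Iout1 - Iout2 = Iin (Equation 4.10), with both outputs positive.
    """
    root = math.sqrt(i_in**2 + 4.0 * i_dc**2)
    return (root + i_in) / 2.0, (root - i_in) / 2.0

i_dc = 10.0  # nA, illustrative value (an assumption, not a programmed bias)
for i_in in (-100.0, -20.0, 0.0, 20.0, 100.0):  # bidirectional input in nA
    i_out1, i_out2 = current_splitter(i_in, i_dc)
    print(f"Iin = {i_in:7.1f} nA: Iout1 = {i_out1:7.2f} nA, Iout2 = {i_out2:7.2f} nA")
```

Both outputs remain positive for either sign of the input, which is what allows a bidirectional signal to be carried by two unidirectional currents.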
Once the individual components of the oscillator were tested, the entire system was
compiled onto the MFPAA. The result of the compilation is shown in Figure 4.29. Note
that while the oscillator uses 10 of the 17 MITE CABs on the MFPAA, only 70 of 272
MITEs were used. In addition, the oscillator required 199 switches to be turned on. It should
also be noted that most of the routing required global lines, as a large percentage of the
signals were shared between the computational elements. Unfortunately, the oscillator did not
function properly once compiled and is still being investigated. The oscillator output a constant
current for Ix and no current for Iy.
Figure 4.27. Topology of the geometric mean current splitter.
[Schematic labels: Iin, Idc, Iout1, Iout2, Vcas.]

Figure 4.28. The results from a geometric current splitter built on the MFPAA are shown.
[Plots versus input current (nA): single-ended output currents Iout+ and Iout− (nA); differential
output current Iout+ − Iout− compared with Iin (nA); geometric mean of the output currents (nA).
Panel title: Results of the Geometric Current Splitter.]

Figure 4.29. Compilation of a sinusoidal oscillator with independent frequency and amplitude control
onto the MFPAA. Note that the oscillator used 10 of the 17 CABs on the MITE FPAA.
[Plot: programmed switch locations by row and column, annotated with coordinate pairs.
Panel title: Compilation of a Sinusoidal Oscillator onto the MFPAA.]
CHAPTER 5
FROM EQUATION TO HARDWARE: THE SOFTWARE IN
BETWEEN
In order to effectively utilize the improved translinear FPAA, a software chain was
developed. The collective purpose of this chain was to implement, in hardware, the equation
entered by the user. The main components of the chain are network synthesis, place-and-route,
and programming. In addition, a GUI was created to allow the user to review the
mapped function and make changes if necessary. While this work was meant to show how
a translinear FPAA can be used to both simplify the user interface and abstract the design
work away from the user, the software created was not meant to exhibit the ideal solutions.
Furthermore, while it is possible to create a true synthesis procedure for dynamic functions
as well, only static equations were considered for synthesis.
5.1 Network Synthesis
The first step in the software chain is the synthesis of a circuit topology from the input
equation. This topic was explored thoroughly by Dr. Subramanian in [16]. Translinear
elements can be used to synthesize any linear, polynomial or rational function, and thus
any irrational function must be approximated in order to realize it. MATLAB was chosen
as the user interface for this software package since it is widely used in laboratory settings.
In addition, the symbolic toolbox contained within MATLAB was utilized to make some
of the symbolic manipulation simpler.
In order to take advantage of Dr. Subramanian’s work, a set of MATLAB functions
was written to parse the input equation into modules capable of being processed by the
MITE Computational Elements (MCE). First, the expression is prepared for parsing by
expanding it using MATLAB’s symbolic toolbox. Since expanding the expression blindly
may not lead to optimal use of components in the FPAA, an option for the user to create
sub-blocks was included. This is done by using ‘[’ and ‘]’ instead of parentheses while
entering the equation. Anything included in brackets is treated as its own expression and
is replaced by a new variable in the original expression. Once expanded, each expression
is split at the + and − signs in order to break it into units containing only multiplication,
division, and powers. These ideas are illustrated in Figure 5.1.

Figure 5.1. A representation of how equations are parsed for use in the MITE FPAA. Equations are
split at addition and subtraction signs to create units that will be implemented by MITE
Computational Elements. The user’s expression is expanded first in order to create a simple
parsing tree (shown on the left). However, the user can define sub-blocks by using brackets
to replace an expression with an intermediate variable (shown on the right).
[Diagram: on the left, (x+y)^2 is expanded to x^2+2xy+y^2 and split into x^2, 2xy, and y^2;
on the right, [x+y]^2 becomes Var1^2, with Var1 = x+y split into x and y.]
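The bracket substitution and the splitting at + and − signs can be sketched in a few lines. The original software was written in MATLAB; the Python below is only an illustration of the two steps, with hypothetical function names, and it assumes the expression has already been expanded (the sketch performs no symbolic expansion) and that brackets are not nested.

```python
import re

def extract_subblocks(expression):
    """Replace each [...] group with an intermediate variable Var1, Var2, ..."""
    subblocks = {}

    def replace(match):
        name = f"Var{len(subblocks) + 1}"
        subblocks[name] = match.group(1)
        return name

    # Non-nested brackets only (a simplifying assumption of this sketch).
    reduced = re.sub(r"\[([^\[\]]*)\]", replace, expression)
    return reduced, subblocks

def split_terms(expression):
    """Split an expanded expression at top-level + and - signs.

    Returns (sign, term) pairs. Signs inside parentheses, or directly after
    '^', '*' or '/', are kept as part of the current term.
    """
    terms, sign, current, depth = [], 1, "", 0
    for ch in expression.replace(" ", ""):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch in "+-" and depth == 0 and not current.endswith(("^", "*", "/")):
            if current:
                terms.append((sign, current))
                sign, current = 1, ""
            if ch == "-":
                sign = -sign
        else:
            current += ch
    if current:
        terms.append((sign, current))
    return terms

reduced, subblocks = extract_subblocks("[x^2+y^2]^(1/2)")
print(reduced, subblocks)
print(split_terms(subblocks["Var1"]))
```

Each resulting term contains only multiplication, division, and powers, so it is a candidate for a single computation element, while the signs record whether the term is later added or subtracted.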
Now that expressions containing only multiplication, division, and powers have been
obtained, a few special cases must be checked for and taken care of. The first of these
cases is an expression that contains fractional exponents. Since MITEs with only two gate
capacitors can only implement powers with magnitudes of 1 or 2, the final expression that
will be implemented can only have integer exponents. This is accomplished by raising
the expression to the lowest integer power that will result in all integer exponents. While
the new expression is now capable of being implemented, the output now has an exponent
other than one. To correct this, the output signal will be fed back to produce an equation
that results in the intended output. An example of this process is shown here:
\[ I_{out} = I_1^{1/2} I_2^{1/4} I_3^{1/4} \implies I_{out}^4 = I_1^2 I_2 I_3 \implies I_{out} = \frac{I_1^2 I_2 I_3}{I_{out}^3}. \tag{5.1} \]
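The lowest integer power that clears every fractional exponent is the least common multiple of the exponent denominators. The Python sketch below reproduces the steps of Equation 5.1 on a dictionary of exponents; the function name and the dictionary representation are assumptions made for illustration and do not reflect the structure of the MATLAB implementation.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def clear_fractional_exponents(exponents):
    """Convert Iout = prod(I_k ** e_k) with fractional e_k to integer exponents.

    Returns (n, integer_exponents), where n is the lowest integer power that
    makes every exponent an integer. When n > 1, the output is fed back with
    exponent -(n - 1) so that the expression again equals Iout.
    """
    denominators = [Fraction(e).denominator for e in exponents.values()]
    n = reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
    integer_exponents = {name: int(Fraction(e) * n) for name, e in exponents.items()}
    if n > 1:
        integer_exponents["Iout"] = -(n - 1)
    return n, integer_exponents

n, result = clear_fractional_exponents(
    {"I1": Fraction(1, 2), "I2": Fraction(1, 4), "I3": Fraction(1, 4)}
)
print(n, result)
```

Applied to the cube root of Equation 4.8, the same procedure gives n = 3 and the feedback form Iout = Iin·Is^2/Iout^2, consistent with the description of Figure 4.21, in which the output current is fed back to create the cube root.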
Another common case that must be checked for is an expression that does not result in an
output with the correct units, amperes. This is similar to the above case in that the output
of the expression must have an exponent of 1. In this case, a scaling current is added to
the expression to account for this. The scaling current will have a constant value and will
set the value of 1 for the system. This means that squaring an input with the same value as the
scaling current will result in an output of that same value. An example of this process is shown here:
\[ I_{out} = \frac{I_1^3 I_2}{I_3^2} \implies I_{out} = \frac{I_1^3 I_2}{I_3^2 I_s}. \tag{5.2} \]
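The units check amounts to requiring that the exponents of the right-hand side sum to 1. A minimal Python sketch of this bookkeeping is given below, using the same hypothetical dictionary representation of exponents as an illustration; the function name and the default variable name Is are assumptions.

```python
from fractions import Fraction

def add_scaling_current(exponents, name="Is"):
    """Add a scaling current so that the exponents sum to 1 (units of amperes)."""
    degree = sum(exponents.values(), Fraction(0))
    balanced = dict(exponents)
    if degree != 1:
        balanced[name] = 1 - degree
    return balanced

# Equation 5.2: I1^3 * I2 / I3^2 has net degree 2, so Is enters with exponent -1.
print(add_scaling_current({"I1": 3, "I2": 1, "I3": -2}))
# Squaring, Equation 4.7: Iin^2 needs Is with exponent -1.
print(add_scaling_current({"Iin": 2}))
```

An expression that already has net degree 1, such as Ia·Ib/Ic, is returned unchanged.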
In addition, coefficients must be found and included. This is done by breaking each
expression at every symbol (there is an implicit multiplication sign between every term that
is being multiplied) and checking for numbers. Any numbers found that do not follow a
‘^’ are assumed to be coefficients. The coefficients are included in the synthesis by adding
two scaling currents to the expression. The ratio of these currents represents the coefficient
multiplication. For instance,
\[ I_{out} = \frac{3 I_1 I_2}{I_3} \tag{5.3} \]
is implemented as
\[ I_{out} = \frac{I_1 I_2 I_{s1}}{I_3 I_{s2}} \tag{5.4} \]
where
\[ \frac{I_{s1}}{I_{s2}} = 3. \tag{5.5} \]
Once functions capable of being implemented with the MITEs are obtained, Dr. Subramanian’s
work can be leveraged to map the functions onto an MCE. As described in [16], the
fixed gate connections of the 5 input MITEs contained in each MCE produce a set pattern
in the exponents of the expression implemented. This pattern can be altered by changing
where the gates of the output MITE are connected. The possible patterns are shown in
Table 5.1. Exponents with a magnitude greater than two must be realized by connecting
the input signal to multiple MITEs. For example,
\[ I_{out} = \frac{I_1^3 I_2^2}{I_3^4} \tag{5.6} \]
must be implemented as
\[ I_{out} = \frac{I_1 I_1^2 I_2^2}{I_3^2 I_3^2}. \tag{5.7} \]
In addition, expressions that cannot be implemented in a single MITE Computation Element
must be broken up into multiple elements. For example,
\[ I_{out} = \frac{I_1^4 I_2^3}{I_3^5 I_4} \tag{5.8} \]
must be implemented as
\[ I_{out} = \frac{I_{temp} I_2^2}{I_3 I_4} \tag{5.9} \]
where
\[ I_{temp} = \frac{I_1^2 I_1^2 I_2}{I_3^2 I_3^2}. \tag{5.10} \]
Table 5.1. Exponent Patterns Generated with Different Gate Connections

Gate Connections    Input Exponent Pattern
1, 3                +1, -1, +1,  0,  0
1, 5                +1, -1, +1, -1, +1
2, 2                -1, +2,  0,  0,  0
2, 4                -1, +2, -1, +1,  0
3, 3                +1, -2, +2,  0,  0
3, 5                +1, -2, +2, -1, +1
4, 4                -1, +2, -2, +2,  0
5, 5                +1, -2, +2, -2, +2
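Table 5.1 can be read as a lookup from the output MITE's gate connections to the exponent applied to each of the five inputs. The Python sketch below encodes the table and evaluates the corresponding ideal output current. The dictionary and function names are assumptions for illustration, and the model is simply the product of the inputs raised to the tabulated exponents.

```python
# Table 5.1: gate connections of the output MITE -> exponents of inputs 1..5.
EXPONENT_PATTERNS = {
    (1, 3): (+1, -1, +1, 0, 0),
    (1, 5): (+1, -1, +1, -1, +1),
    (2, 2): (-1, +2, 0, 0, 0),
    (2, 4): (-1, +2, -1, +1, 0),
    (3, 3): (+1, -2, +2, 0, 0),
    (3, 5): (+1, -2, +2, -1, +1),
    (4, 4): (-1, +2, -2, +2, 0),
    (5, 5): (+1, -2, +2, -2, +2),
}

def mce_output(gates, inputs):
    """Ideal output current of one computation element.

    gates is a key of EXPONENT_PATTERNS; inputs lists the five input
    currents in order, with None for inputs whose exponent is 0.
    """
    result = 1.0
    for exponent, current in zip(EXPONENT_PATTERNS[gates], inputs):
        if exponent != 0:
            result *= current ** exponent
    return result

# Squaring (Equation 4.7) with gates (2, 2): input 1 = Is, input 2 = Iin.
print(mce_output((2, 2), (100.0, 200.0, None, None, None)))
# Multiplication (Equation 4.6) with gates (1, 3): inputs are Ia, Ic, Ib.
print(mce_output((1, 3), (50.0, 25.0, 10.0, None, None)))
```

Every pattern in the table sums to 1, so any expression mapped onto a single element automatically has units of current, which is the same requirement that motivated the scaling current in Equation 5.2.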
While the MITE elements realize the multiplication, division, and powers found in the
user’s expression, addition and subtraction must still be included. This is done through
the use of KCL. Intermediate expressions that are added together are summed by simply
connecting the current-mode output of each expression’s respective output MITE together.
Similarly, subtraction is accomplished by connecting the appropriate output MITEs to
different sides of a current mirror. This can also be described as changing the sign of the
signal being subtracted and then adding it to the positive terms of the expression.
The synthesis function creates a list of MCEs, current mirrors, and variables that will
be passed to the place-and-route function. An example of this output for the expression
\[ \sqrt{x^2 + y^2} \tag{5.11} \]
is shown in Figure 5.2. Note that each row in Circuit_info.loops represents an MCE, with
the first 5 entries representing which signals are applied to the input MITEs and in what
order (0 means no signal is connected). The last entry in each row of Circuit_info.loops is the
variable number associated with the output signal of that element. Circuit_info.gates stores which
pattern to connect the gates of the output MITE in. Circuit_info.mirrors is similar to
Circuit_info.loops in that all of the entries, except for the last one in each row (which is
the name of the mirror’s output), represent which signals are connected to each mirror. A
signal is applied to the input of the current mirror unless its number is shown as a negative
number, in which case it is connected to the mirror’s output, creating subtraction. A list of
variables, with attributes that will be used by the place-and-route function, and a list of
variable names are also stored.
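The sign convention for the rows of Circuit_info.mirrors can be illustrated with a short sketch. The Python below is one reading of the description above, not the MATLAB code itself: the function name, the dictionary of signal currents, and the numerical values are assumptions, and each mirror is modeled as an ideal KCL sum.

```python
def mirror_output(row, currents):
    """Evaluate one row of a mirrors table under an ideal KCL model.

    row lists signal numbers followed by the output variable number.
    Positive entries are added, negative entries are subtracted, and
    0 means that no signal is connected.
    """
    *signals, output_variable = row
    total = 0.0
    for signal in signals:
        if signal > 0:
            total += currents[signal]
        elif signal < 0:
            total -= currents[-signal]
    return output_variable, total

# Hypothetical currents for signals 7 and 9 (for example x^2/Is and y^2/Is).
currents = {7: 9.0, 9: 16.0}
print(mirror_output((9, 7, 1), currents))   # row from Figure 5.2: sum stored in variable 1
print(mirror_output((9, -7, 1), currents))  # hypothetical row with signal 7 subtracted
```

Under this reading, the second row of Circuit_info.mirrors in Figure 5.2 sums signals 9 and 7 into variable 1, which corresponds to the intermediate variable Var1 = x^2 + y^2 of the vector magnitude expression.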
5.2 Place-and-route
Place-and-route algorithms are an area of active research in both FPGAs [17–19] and
FPAAs [20, 21]. Again, the simple algorithm used here is meant to show the possibilities
of using a translinear FPAA in simplifying the software algorithms needed. In addition,
the algorithm shown here, while easily extendable, has only been created for one
column of the MFPAA. The algorithm, which uses the output of the synthesis function,
Circuit_info, to create another structure called Map_info, can be broken into two distinct
functions: placement of the components used and routing of the signals between them. The
place-and-route algorithm also modifies Circuit_info in order to remove unnecessary
components or add components necessary for interfacing signal types that are not compatible.
For instance, current mirrors with only one input can be removed since no addition or
subtraction is taking place. If a current mirror is needed for routing between two components,
>> Circuit_info
Circuit_info = 
              mirrors: [2x3 double]
                gates: [3x2 double]
                coeff: [1 1 1]
                loops: [3x6 double]
                 vars: [9x6 double]
           expression: '[x^2+y^2]^(1/2)'
    SolvedExpressions: [2x10 char]
              varlist: [3x4 char]
>> Circuit_info.loops
ans =
     1     5     4     0     0     5
     6     2     0     0     0     7
     8     3     0     0     0     9
>> Circuit_info.gates
ans =
     1     3
     2     2
     2     2
>> Circuit_info.mirrors
ans =
     5     0     0
     9     7     1
>> Circuit_info.vars
ans =
     1     0     0     0     0     0
     2     1     0     0     0     0
     3     1     0     0     0     0
     4     0     0     1     0     1
     5     0     0     0     1     0
     6     0     0     1     0     1
     7     0     0     0     1     0
     8     0     0     1     0     1
     9     0     0     0     1     0
>> Circuit_info.varlist 
ans =
Var1
x   
y 
Figure 5.2. The output of the network synthesis function for a vector magnitude equation. The function
returns a structure, Circuit_info, that contains lists of MITE Computation Elements, shown
in Circuit_info.loops, gate connections for those elements, shown in Circuit_info.gates,
current mirrors, shown in Circuit_info.mirrors, and variables, shown in Circuit_info.vars. In
addition, a list of variable names is stored in Circuit_info.varlist. These names correspond
to the first variables in Circuit_info.vars. Other information is also stored but is not shown
here.
it will be added during the routing function.
The placement function breaks the input structure into five main categories: inputs,
outputs, loops, scaling currents, and mirrors. They are placed in the order shown. Input and
output placement is very simple, as there is a bank of VI converters for the inputs and a bank
of output drivers for the outputs. A simple search for those not being used is performed and
all inputs and outputs are placed accordingly. The loops are placed in a similar manner.
Starting in the CABs closest to the Input/Output CAB, the loops are assigned to MITE
Computation Elements until each entry in Circuit_info.loops is accounted for. Scaling
currents are then placed in the same CABs as the loops in which they appear.
This function uses Circuit_info.vars, which contains a field that defines whether a
variable is a scaling current. Lastly, each current mirror is placed in a CAB to which
at least one of its inputs or outputs is connected.
While each element is being placed, the output of the place-and-route function, Map_info,
is being populated. Map_info is a structure which contains an array of structures
representing each CAB. Each CAB structure in turn contains lists of loops, mirrors, and scaling
current biases. These lists are filled with the variable number that represents the output
of each structure. For example, Map_info.cab(5, 2).loops stores the two variables which
represent the output of the two MITE Computation Elements in CAB(5,2).
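The loop placement step can be sketched as follows. This is a minimal Python illustration, not the MATLAB function used in this work. It assumes two MITE Computation Elements per CAB (as in the CAB(5,2) example above), a single column, and that the CAB row nearest the Input/Output CAB is row 5, with row numbers decreasing as the distance grows. The name place_loops and its parameters are hypothetical.

```python
def place_loops(loop_outputs, column=2, first_row=5, per_cab=2):
    """Assign loop output variables to CABs, filling the nearest CAB first.

    Assumptions (not taken from the original code): rows are numbered so that
    first_row is adjacent to the I/O CAB, and each CAB holds per_cab loops.
    """
    placement = {}
    row = first_row
    for start in range(0, len(loop_outputs), per_cab):
        if row < 1:
            raise ValueError("not enough CABs in this column")
        # Each dictionary key plays the role of Map_info.cab(row, column).loops.
        placement[(row, column)] = loop_outputs[start:start + per_cab]
        row -= 1
    return placement


# Loop output variables taken from the last column of Circuit_info.loops
# in Figure 5.2.
print(place_loops([5, 7, 9]))
```

With three loops, the first two land in the CAB next to the I/O CAB and the third in the next CAB up the column. The actual function may place a different number of loops, since Circuit_info is modified before placement.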
After the placement function has successfully completed, the signals are routed between
the placed components. The routing function breaks this task up into smaller tasks:
the routing of the MITE gate capacitors, MITE inputs, current mirrors, and output. The
MITE gate capacitors are the simplest routing in the system. Using the information stored
in Circuit_info.gates, each gate capacitor of the output MITE is connected to the correct
controlling voltage, the voltage generated by the diode-connected gate capacitors of the
input MITEs. A maximum of two switches per gate capacitor is needed to do this.
The routing of the rest of the signals first requires a discussion of whether signals are
broadcast or not. If a signal is routed to multiple places on the FPAA, it is said to be
broadcast. This requires the signal to be routed as a voltage, as currents require copies to
be made if they are to be sent to multiple places. The voltage representing the signal is
actually the diode connected voltage of a current mirror. The MITEs which make up the
core of the system can accept either a voltage or a current as their input. If current is to be
used, the nFET used for routing voltages must be grounded. If voltage is to be used, the
line connected to the drain of the MITE used for routing currents must be left floating. This
is shown in Figure 4.13. Also note that it is possible to add two signals together by using
both the current and voltage inputs to a single MITE.
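One way to decide which signals must be broadcast is to count how many destinations each signal feeds. The Python sketch below does this for the loops and mirrors arrays of Figure 5.2. It is a naive use count made before any clean-up of the circuit description, so the routing function used in this work may classify signals differently (for example, after single-input mirrors have been removed). The name broadcast_signals is hypothetical.

```python
from collections import Counter

# Rows copied from Figure 5.2; the last entry of each row is the output.
LOOPS = [
    [1, 5, 4, 0, 0, 5],
    [6, 2, 0, 0, 0, 7],
    [8, 3, 0, 0, 0, 9],
]
MIRRORS = [
    [5, 0, 0],
    [9, 7, 1],
]


def broadcast_signals(loops, mirrors):
    """Return the set of signals that are routed to more than one place."""
    uses = Counter()
    for row in loops:
        uses.update(s for s in row[:-1] if s != 0)
    for row in mirrors:
        # A negative entry is the same signal on the mirror's output side.
        uses.update(abs(s) for s in row[:-1] if s != 0)
    return {signal for signal, count in uses.items() if count > 1}


print(broadcast_signals(LOOPS, MIRRORS))
```

A signal flagged this way would have to be distributed as a voltage, since a current would need one copy per destination.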
The routing of the MITE inputs can be broken into further sub-categories of signals
coming from input VIs, scaling current biases, loop outputs, and current mirror outputs.
Inputs from VIs are routed on vertical global routing by default, requiring two switches to
route for non-broadcast signals or a maximum of 5 switches to route for broadcast signals.
Three of the five switches needed for broadcast signals only need to be included once,
meaning only two switches are required for all other occurrences of that input signal. Inputs
from scaling current biases are also simple to route, as they are contained in the same CAB.
Routing a bias current to a MITE input requires a current mirror either for reversing the
polarity of the signal, or for creating the diode connected voltage if it is broadcast. This
requires a maximum of 4 switches for non-broadcast signals, and 3 switches for broadcast
signals. The routing of loop outputs to MITE inputs is similar to that of scaling current
biases, except that the loop producing the signal may be in a different CAB than the MITE
it is an input to. A check is done to see if this is the case and local, nearest neighbor vertical,
or global vertical routing is used accordingly. Again, a maximum of either 4 or 3 switches
is required for non-broadcast and broadcast signals respectively. Lastly, current mirror
outputs are routed to the MITE inputs. A similar process to how loop outputs are routed
is used. However, if the output of a current mirror must be broadcast, it must be passed
through an MCE, by multiplying it by 1, before this is possible. The information stored
in Circuit_info is modified to make these changes before the place-and-route algorithm
begins.
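The switch counts quoted above can be collected into a small lookup, which gives an upper bound on the switches needed for a given route. The Python sketch below is illustrative only; the dictionary keys and function names are hypothetical, and the counts are the maxima stated in the text rather than values taken from the routing code.

```python
# Maximum switch counts quoted in the text, as (non-broadcast, broadcast).
MAX_SWITCHES = {
    "vi_input": (2, 5),
    "scaling_bias": (4, 3),
    "loop_output": (4, 3),
}


def max_switches(source, broadcast):
    """Upper bound on switches for one route from the given source type."""
    non_broadcast_count, broadcast_count = MAX_SWITCHES[source]
    return broadcast_count if broadcast else non_broadcast_count


def vi_broadcast_total(destinations):
    """Upper bound for one broadcast VI input reaching several MITE inputs.

    Three of the five switches are shared and only included once; every
    destination then needs two switches of its own.
    """
    if destinations < 1:
        raise ValueError("a routed signal needs at least one destination")
    shared, per_destination = 3, 2
    return shared + per_destination * destinations


print(max_switches("loop_output", broadcast=False))
print(vi_broadcast_total(3))
```

For a broadcast VI input with three destinations the bound is 3 + 2·3 = 9 switches, compared with 15 if all five switches were repeated for every destination.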
Once the MITE inputs are routed, the current mirror inputs are routed in similar fash-
ion. Since these signals are currents by default, they cannot be broadcast. Again, this is
checked before the place-and-route algorithm begins and the circuit information is modified
to account for this. Routing a current mirror input requires only two switches.
Finally, the output is routed. The output can come from either an MCE or a current
mirror. If it comes from a MITE, the output current is routed directly to the output driver
using nearest neighbor vertical routing. This is possible because the output is placed in a
CAB next to the input/output CAB during placement. If the output comes from a current
mirror it can also be routed directly, but the inputs to the mirror must be altered. This is due
to the output driver sinking a current, rather than sourcing a current like the MITEs. Thus,
if the output of the system comes from a current mirror, the signals routed to the input and
output of the mirror are flipped during the routing process.
A sample of the routing function’s output structure, Map_info, is shown in Figure 5.3.
Map_info stores a great deal of information, including what signals are routed on which
lines of the switch matrix, which expression each loop is implementing, and which scaling
current bias or current mirror is assigned to which signal. However, the most important
information stored is the list of switches to turn on, found in Map_info.switches,
and which computational elements need to be programmed and to what level, found in
Map_info.prog. The documentation used for routing signals is shown in Figures 5.4-5.7.
>> Map_info
Map_info = 
      num_loops: 4
      num_mirrors: 1
      num_inputs: 3
      switches: [2x44 double]
      global: [1x3 struct]
      cab: [6x3 struct]
      outputs: [8 0 0 0 0 0 0 0 0 0]
      inputs: [0 0 0 0 0 0 0 4 3 2]
      prog: [3x30 double]
      loops: [4x5 double]
      mirrors: [5 2 3]
      biases: [2x4 double]
>> Map_info.cab(5,2)
ans = 
      local: [0 0 0 0 7 9 10 99 99 99]
      nnv_up: [0 0 0 0 0 0 0 0 0 8]
      nnv_down: [0 0 0 0 0 0 0 6 1 5]
      df: 1
      loops: [8 7]
      filter: 0
      mirrors: [10 9 1 0 0 0]
      biases: [10 9 0 0 0 0]
      loop1: 'I5^(1)*I9^(1)*I10^(-1)'
      loop2: 'I3^(2)*I4^(-1)'
Figure 5.3. The output of the place-and-route function is shown for a vector magnitude circuit. The
function returns a structure, Map_info, which contains a list of switches and biases to be
programmed, found in Map_info.switches and Map_info.prog respectively. The biases consist
of scaling current biases and MITE programming levels. Map_info is shown first.
The structure also contains an array of structures representing each CAB. The structure
of CAB(5,2) is shown second. Each CAB structure contains fields showing which
signals are routed on each column of local, nearest neighbor up, and nearest neighbor down
vertical routing. Global routing is stored in Map_info.global. In addition, each CAB structure
contains fields showing which loops and mirrors are implemented in the CAB, and
what expressions the loops are realizing.
[Figure 5.4 diagram: Routing in a typical CAB. The switch matrix is indexed by Drain Line Index and Gate Line Index. The vertical routing groups are Global (10), two groups of Nearest Neighbor Vertical (10), Local (10), and Power & Gnd. The horizontal groups are Global Horizontal (10), Bridge (1), and the CAB itself. The index ranges marked on the diagram are (0-41), (42-45), (0), (1-10), and 11-61.]
The floating-gate bridge is replaced by an additional row of global horizontal
routing in the bottom row of CABs. This leaves the overall number of rows
equal for all CABs.
Figure 5.4. Routing diagram of the MFPAA CAB.
CAB offsets and distinctions
Each entry is the (row,column) of the first element in the CAB, which corresponds to (0,0) in the CAB routing diagram.
Nearest neighbor vertical routing: uf = up first, df = down first.
(0,0) df      (0,46) df      (0,93) df
(62,0) uf     (62,46) uf     (62,93) uf
(124,0) df    (124,46) df    (124,93) df
(186,0) uf    (186,46) uf    (186,93) uf
(248,0) df    (248,46) df    (248,93) df
(310,0) uf    (310,46) uf    (310,93) uf
One CAB in the figure is labeled as the I/O CAB.
Figure 5.5. CAB offsets of the MFPAA CABs.
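Since row and column indexes are referenced to the switch matrix of a single CAB, a switch address on the full chip is the local index plus the offset of its CAB. The Python sketch below tabulates the offsets of Figure 5.5 and performs that addition. It assumes that CAB(r, c) is indexed from 1 and that the offsets are ordered as listed in the figure; the helper names are hypothetical.

```python
# Offsets read from Figure 5.5: first switch-matrix row and column of each CAB.
ROW_OFFSETS = [0, 62, 124, 186, 248, 310]
COL_OFFSETS = [0, 46, 93]


def cab_offset(cab_row, cab_col):
    """Offset of CAB(cab_row, cab_col), assuming 1-based CAB indexes."""
    return ROW_OFFSETS[cab_row - 1], COL_OFFSETS[cab_col - 1]


def to_global(cab_row, cab_col, local):
    """Convert a local (row, column) switch index to a chip-level index."""
    row_offset, col_offset = cab_offset(cab_row, cab_col)
    return row_offset + local[0], col_offset + local[1]


def nnv_direction(cab_row):
    """Nearest neighbor vertical order: 'df' (down first) or 'uf' (up first)."""
    # The figure alternates df, uf, df, ... starting from the row at offset 0.
    return "df" if cab_row % 2 == 1 else "uf"


print(to_global(5, 2, (11, 38)))
print(cab_offset(6, 2), nnv_direction(5))
```

Under these assumptions a local switch (11,38) in CAB(5,2) maps to (259,84) on the chip, and CAB(6,2) has the offset (310,46).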
[Figure 5.6 diagram: the blocks shown are the bias current generators, the Cascode Voltage Generator, two MITE Computational Elements (ports In_V<1:5>, In_I<1:5>, Gate<1:2>, Iout), and the MITE Filter (ports Ix_V, Ix_I, It1_V, It1_I, It2_V, It2_I, Iout), connected to switch-matrix rows Row<11> through Row<61>. Programming lines labeled in the diagram include Vg<6,8,10,12,14,16>, Vg<2,4>, Vg<19,21,23,25,27,28>, Vg<32,34,36,38,40,41>, Vg<45,47,49,50>, and Vd<42> through Vd<45>.]
Notes from the figure:
All bias current generators use Vd<42> for programming.
All row and column indexes are referenced to the CAB switch matrix.
For the cascode generator, Vg<2> should be programmed to the maximum MITE current and Vg<4> should be programmed to 1/15 of the maximum MITE current.
Figure 5.6. Programming information for elements in the MFPAA CAB.
5.3 Hardware Programming
The last major function in the software chain is hardware programming. While program-
ming floating-gate transistors has been discussed in previous chapters, functions have been
added to make interfacing with an FPAA much easier. Most importantly, a GUI has been
created to show the output of the synthesis and place-and-route functions. This GUI shows
the FPAA and draws the switches which will be turned on and the connections between
them. It also includes diagrams of the CABs so the user can easily understand what is
being connected. A sample of the GUI is in Figure 5.9.
In addition to allowing the user to easily understand how the equation is being imple-
mented on the FPAA, the GUI also allows the user to modify the implementation if they
desire. The GUI allows the user to click on any [row,column] pair and either add a new
switch or delete a switch if one already exists. The GUI is updated after every click. Fur-
thermore, if the user would like to add switches that are programmed as biases, creating
resistive elements, they may enter a target programming value in the designated field and
then use the GUI in the same way. When the user is finished, clicking the ’Done’ button
outputs the final list of switches and programmable elements into the workspace for the
programming code to use.
Once the equation has been synthesized and routed, the switches and programmable
elements in the list are programmed on the chip. The setup that allows for this includes
a printed circuit board (PCB), a microcontroller, and a computer for communication. A
picture of the setup is shown in Figure 5.10. Routines for selecting devices, programming
switches, and programming computational elements are stored on the microcontroller and
initiated by communication from the computer. The computer communicates, over either
serial or USB, directly from MATLAB, allowing easy interfacing between the synthesis,
place-and-route, and programming code.
[Figure 5.7 schematics: two panels, MITE Computational Element Details and MITE Filter Details. The MITE Computational Element panel shows ports In_V<1:5>, In_I<1:5>, Gate<1>, Gate<2>, and Iout. The MITE Filter panel shows ports Ix_V, Ix_I, It1_V, It1_I, It2_V, It2_I, and Iout, with C_filter = 1pF. Both panels show cascode biases Vcas and programming gates Vg<1> through Vg<6>.]
Figure 5.7. Details of the MFPAA CAB computational elements.
I/O CAB Details
Output Drivers: row<11:20>, I_in<1:10>, I_out<1:10>. Output is routed directly to pads<71:62>.
Broadcast Drivers for Inputs: V_out<1:10>, row<22,24,26,28,30,32,34,36,38,40>, I_in<1:10>,
row<23,25,27,29,31,33,35,37,39,41>, I_out<1:10>.
V-I converters: Vg_in<1:10>, Vg_out<1:10>, Rin<1:10> from pads<7:16>.
Vg_in and Vg_out are shown for programming purposes.
All V-I converters use Vd<42> for programming.
All row and column indexes are referenced to the CAB switch matrix.
The offset for the I/O CAB, (310,46), must be added.
Row indexes:
V-I       1    2    3    4    5    6    7    8    9   10
Vg_in    43   45   47   49   51   53   55   57   59   61
Vg_out   42   44   46   48   50   52   54   56   58   60
I_out    43   45   47   49   51   53   55   57   59   61
Figure 5.8. Programming information for the MFPAA I/O CAB.
[Figure 5.9 plots: (a) the full GUI output and (b) a zoomed-in view, both titled Implementation of (x^2+y^2)^(1/2), with Column on the horizontal axis and Row on the vertical axis. Each switch that is turned on is annotated with a (row, column) label such as (11,38) or (12,37).]
Figure 5.9. A sample of the GUI for interfacing with the FPAA is shown. The GUI output is shown
for a vector magnitude circuit.
(a) Full GUI output.
(b) Zoomed-in GUI so that more detail can be seen.
Figure 5.10. The laboratory setup of the full hardware system is shown. The setup includes the MFPAA
(circled), a PCB, a microcontroller, and a computer for communication (not shown).
CHAPTER 6
VOLTAGE-CURRENT AND CURRENT-VOLTAGE CONVERTERS
FOR SYSTEM INTERFACING
6.1 A VI converter
The input/output CAB of the translinear FPAA was also modified from the original design.
Most importantly, the voltage-to-current converter was redesigned for both accuracy and
speed considerations. Fully integrated VI converters have been proposed [22, 23], but the
die area and power consumption required are undesirable. The VI must be able to convert
currents on the order of nanoamps, even down to picoamps at times, without sacrificing the
speed of the entire system. This requires an extremely low input resistance to compensate
for the large capacitance of the bonding pad. This is accomplished by using active feedback,
shown in Figure 6.1. In addition to allowing high speeds, the amplifier also improves
accuracy by keeping $V_{fixed}$ from moving. The second amplifier, on the output side of the
current mirror, is included to reduce mismatch in the current mirror by ensuring that the drain
voltage of each mirror transistor is $V_{ref}$. This design is similar to the one presented in [24].
The bias currents of the VI are generated using floating-gate current sources. The speed of
the VI can be written as
$$f_{-3\mathrm{dB}} = \frac{1}{2\pi \left(R \parallel R_{in}\right) C_{pad}} \approx \frac{1}{2\pi R_{in} C_{pad}} \tag{6.1}$$
and its accuracy can be written as
$$\mathrm{Error} \approx \frac{R_{in}}{R}, \tag{6.2}$$
where
$$R_{in} \approx \frac{1}{g_{m1}\left[g_{m2} r_{ds2} (A + 1) + 1\right]}. \tag{6.3}$$
The amplifiers used are simple pFET-input 5-transistor OTAs with a gain of approximately
$g_m r_{ds}$. $V_{ref}$ is usually set to 0.4 V and $R$ is usually 10 MΩ. The layout of the VI converter is
shown in Figure 6.2.
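To give a feel for the numbers in (6.1) and (6.2), the Python sketch below computes the largest input resistance that still gives a bandwidth of 100 kHz, and the error that this resistance would imply. It assumes the 5 pF pad capacitance used for the simulations in this chapter and the usual 10 MΩ value of R. The 100 kHz target and the function names are chosen here for illustration only.

```python
import math

C_PAD = 5e-12  # pad capacitance in farads (value used for the simulations)
R = 10e6       # resistor R in ohms (the usual value quoted in the text)


def f_3db(r_in, c_pad=C_PAD):
    """Approximate form of (6.1): bandwidth set by R_in and C_pad."""
    return 1.0 / (2.0 * math.pi * r_in * c_pad)


def r_in_limit(f_min, c_pad=C_PAD):
    """Largest R_in that still gives a -3 dB frequency of f_min."""
    return 1.0 / (2.0 * math.pi * f_min * c_pad)


def error(r_in, r=R):
    """Equation (6.2): fractional conversion error."""
    return r_in / r


r_in_max = r_in_limit(100e3)
print(f"R_in <= {r_in_max / 1e3:.0f} kOhm, error <= {100 * error(r_in_max):.1f} %")
```

Under these assumptions, R_in must stay below roughly 318 kΩ, which by (6.2) bounds the error at about 3.2%. Any further reduction of R_in from the feedback gain in (6.3) improves both the bandwidth and the accuracy.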
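[Figure 6.1 schematic: labeled nodes and devices are Vin, R, Ires, Cpad, Rin, Vfixed, Vref, Ibias1, Ibias2, Ibias1+Ires, Iout, transistors M1 through M4, and two amplifiers labeled A. The schematic is divided into Off-Chip and On-Chip portions.]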
Figure 6.1. VI converter used in the improved MITE FPAA. The amplifier on the input side provides
an extremely low input resistance allowing for high speed and good accuracy. The amplifier
on the output side reduces mismatch between the input and output currents by matching
the drain voltages of the mirror transistors. The bias currents are provided by floating
gates.
Figure 6.2. Layout of the improved VI converter. The converter was designed to match the pitch of the
MFPAA and required an area of 140 µm × 11.6 µm.
[Figure 6.3 plots: top panel, Simulated Transfer Characteristic of Improved VI Converter, Output Current (nA, 0 to 500) versus Input Voltage (V, 0 to 5); bottom panel, Simulated Accuracy of Improved VI Converter, Percent Error (−1 to 1) versus Input Voltage (V, 0 to 5).]
Figure 6.3. Simulated Transfer Characteristic of the Improved VI Converter. The transfer character-
istic, top, and accuracy, bottom, of the converter are shown.
Simulation results for the improved VI converter are shown in Figures 6.3 and 6.4.
A pad capacitance of 5 pF was chosen for simulation purposes. The basic functionality
of the converter is shown in Figure 6.3, which shows the transfer characteristic. The
frequency response of the converter is shown in Figure 6.4 for selected input voltages. Note
that the converter’s $f_{-3\mathrm{dB}}$ is above 100 kHz in all cases. In addition, a harmonic distortion
analysis was performed over a range of output current levels and frequencies. The results
are summarized in Table 6.1.
[Figure 6.4 plot: Simulated Frequency Response of Improved VI Converter, Output Magnitude (dB, −80 to 20) versus Frequency (Hz, 10^0 to 10^8). Legend entries: Vin=0.3V, Vin=1V, Vin=3V, Vin=3V, Vin=4V, Vin=5V.]
Figure 6.4. Simulated Frequency Response of the Improved VI Converter. The response is shown over
a range of input voltages.
6.2 A logarithmic bidirectional IV converter
In order to make current-mode systems viable it is necessary to be able to send the signals
off chip at high speeds. This is a problem when the currents are of low magnitude, since
they need to charge the capacitance of the pad. Even if an external current-to-voltage (IV)
converter is used, and its input voltage kept almost constant, currents in the picoamp and
nanoamp range would still be too slow. This problem can be solved by using an on-chip
Table 6.1. Harmonic Distortion Analysis of Improved VI Converter (dB)
Input Frequency (Hz)    Iout = 10 nA    Iout = 100 nA    Iout = 400 nA
1K                      -118.047        -97.711          -80.306
10K                     -105.561        -85.406          -69.610
20K                     -99.705         -79.551          -63.892
50K                     -91.799         -71.658          -56.300
100K                    -85.274         -65.675          -51.077
[Figure 6.5 schematic: labeled nodes are Iin, Vout, Vref, amplifier A, VbiasN, VbiasP, VshiftN, VshiftP, VgcP, VgcN, and the bias current Ibias.]
Figure 6.5. Schematic of the bidirectional logarithmic IV converter. The converter uses dual feedback
paths to perform a log conversion for both positive and negative input currents. The source
followers provide voltage offsets to ensure the feedback transistors stay saturated. They
also ensure a bias current is running in the direction of the gray arrow, giving the converter
a minimum settling speed even with zero input current. The input current mirror provides
a subtraction of two single-ended signals to give a differential input signal. The output is
again single-ended.
IV converter. The IV converter must be able to convert low currents at speed (have ex-
tremely low input resistance), convert currents over a large dynamic range (be compressive
in nature), and convert currents that are both sunk and sourced (be bidirectional). Both the
low input resistance and the compressive transfer function can be achieved by using a log
converter. Many logarithmic amplifiers that behave as IV converters have been introduced
[25–27], but they lack the bidirectionality and the speed required at low currents.
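The speed problem at low currents follows from the time a constant current needs to move a capacitance through a voltage step, Δt = C·ΔV / I. The Python sketch below evaluates this for a few current levels. The 5 pF capacitance is borrowed from the pad capacitance used for the VI converter simulations, and the voltage steps are arbitrary illustrative values, so the numbers are only indicative.

```python
def charge_time(c_pad, delta_v, current):
    """Time in seconds for a constant current to move c_pad by delta_v volts."""
    return c_pad * delta_v / current


C_PAD = 5e-12  # farads; assumed equal to the pad capacitance quoted for the VI

for current in (1e-12, 1e-9, 10e-9):
    full_swing = charge_time(C_PAD, 1.0, current)    # 1 V step
    held_node = charge_time(C_PAD, 10e-3, current)   # 10 mV step
    print(f"{current:.0e} A: {full_swing:.3g} s for 1 V, {held_node:.3g} s for 10 mV")
```

Under these assumptions a 1 pA signal needs 5 s to move the pad by 1 V, against 5 ms for 1 nA. Holding the pad voltage nearly constant shortens the time in proportion to the voltage step, but at picoamp levels it remains slow, which is why a very low input resistance on chip is needed.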
The first generation of the IV converter is shown in Figure 6.5. Note that the converter
uses dual feedback paths to provide the bidirectionality. The source followers in the feed-
back paths serve multiple purposes: they make sure the drains of the log transistors remain
at the input, they keep the log transistors in saturation, and they provide an input bias cur-
rent at all times. The bias current, represented by the gray arrow in the figure, assures
that the input current never goes to zero, which would cause the speed of the converter to
also approach zero. Note that the bias current also sets the minimum current that can be
converted accurately, creating a trade off between dynamic range and speed.
The IV converter must also use gain adaptation in order to remain stable over the large
range of input signals. In order to achieve high speeds, the dominant pole of the system
must remain at the input node. Thus, maintaining adequate phase margin is difficult since
the dominant pole’s frequency is directly proportional to the magnitude of the input current.
If no gain adaptation were used, the converter would become unstable as the input current’s
magnitude increases and the dominant pole moves closer to the higher order poles. IV
converters have been introduced that assume the secondary poles are much higher than the
input pole [26], but an unreasonable amount of power must be burned in order to achieve
high speeds. In addition, varying the bias current of the amplifier in order to maintain sta-
bility is also possible [27], but power consumption and SNR are again negatively affected.
The solution used here is to reduce the loop gain as the output moves away from its zero
current voltage. This is accomplished by connecting the source of a pFET and the source
of an nFET to the output of the amplifier and connecting their gates to biases (the drains are
[Figure 6.6 plot: Output Voltage (V, 1 to 2) versus Absolute Input Current (|A|, 10^−14 to 10^−8). Legend entries: Positive Currents, Negative Currents, Raw Data, Calibrated Data, Log Fit.]
Figure 6.6. Transfer function of the IV converter. The gray circles represent the raw data of output
voltage versus input current when read through the picoammeter. The dark dots represent
the calibrated data of output voltage versus an exponential fit of the input current. This
removes the error from the picoammeter at low currents. A logarithmic fit is also shown
for this calibrated data.
connected to gnd and Vdd respectively). This causes the output resistance of the amplifier
to be reduced as the resistance in parallel with it decreases.
The results of the IV converter were taken from the CMOS imager it was first fabricated
on. The DC transfer characteristic is shown in Figure 6.6. Two sets of data are shown—the
gray circles represent the raw data taken and the dark dots represent the calibrated data set.
In addition, a logarithmic fit is shown for both positive and negative currents. Note that the
calibrated data follows the fit much better at lower currents than the original data. This is
due to error in the measurements from the picoammeter. In addition, the error at higher
currents for both the original and calibrated data is due to the log transistors leaving deep
[Figure 6.7 schematic: labeled nodes are Iin, Vout, Vref, amplifier A, VbiasN, VbiasP, VshiftP, VgcP, VgcN, and the bias current Ibias.]
Figure 6.7. Schematic of the improved bidirectional logarithmic IV converter. The converter is similar
to the original design but the nFET follower is removed and the pFET follower is modified
in order to reduce distortion.
subthreshold operation and losing their exponential current-to-voltage relationship.
Not seen in Figure 6.6 is the distortion caused by the κ shift in the source followers used
for voltage shifting. This causes the followers to introduce a non-constant shift between
the output of the amplifier and the source of the log transistors. This error was corrected in
the new version of the IV converter shown in Figure 6.7. In order to minimize this error
the bulk of the follower transistor must be attached to the source, which is only possible for
pFETs in the process used. The nFET follower was removed for this reason and the pFET
follower was modified accordingly. It is still possible to bias the structure so that it operates
in the same way as the original design: a bias current always flows and the log transistors
are kept in saturation. This new design has been included in a modified CMOS imager.
The improved transfer characteristic of the new IV converter is shown in Figure 6.8. The
improved IV converter was also tested under transient conditions, but the surrounding test
circuitry was the limiting factor in the speed that could be achieved. Thus, simulated results
are presented in Figure 6.9.
[Figure 6.8 plot: Output Characteristic of Improved IV Converter, Output Voltage (V, 1.3 to 1.65) versus Input Current (A, 10^−14 to 10^−6). Legend entries: Positive Input Currents, Negative Input Currents.]
Figure 6.8. The output characteristic of the improved IV converter is shown. Note that a positive input
current is defined as a current flowing into the converter.
[Figure 6.9 plot: Simulated Transient Response of the Improved IV Converter, Output Voltage (V, 1.3 to 2) versus Time (µs, 0 to 40). Waveform labels: 10nA, 1nA, 100pA, 10pA, and 1pA, shown for both current directions.]
Figure 6.9. The simulated transient response of the improved IV converter is shown. The input was
pulsed between each combination of input current levels, noted by the labels above each
settled waveform, with transitions at 5 µs and 20 µs. Note that the responses shown represent
the worst case settling times since they pass through the zero current level where the speed
of the converter is at its lowest.
CHAPTER 7
A FLOATING-GATE PIPELINED ADC
In addition to the IV converter, an ADC was designed to be used in conjunction with it.
Although ADCs have been designed using floating gates including sigma-delta and flash
converters [28–30], a voltage-mode pipelined implementation that uses floating gates in
order to trim offsets for accuracy was designed. The floating-gate biases allow the power
consumption, area, and complexity of the pipelined stage to be greatly reduced when com-
pared to other pipelined architectures that use other calibration techniques [31–33]. Al-
though the architecture of the ADC, shown in Figure 7.1, can be easily extended to as
many bits as needed, a 10-bit converter was designed and built. The system architecture
includes a clock divider and three phase clock tree so that only one input clock is needed
from off chip. In addition, all of the bit shifting required by pipelined converters is also in-
cluded. A closed-loop sample-and-hold was used on the front end because it could meet the
speed requirements while almost completely removing signal-dependent charge injection.
Each pipelined stage of the ADC consists of a comparator and a multiplying DAC
(MDAC), shown in Figure 7.2. The MDAC is based on the precision multiply-by-2 circuit
presented in [34]. The multiplying circuit was altered to include the DAC functionality
by adding the switches connected to $V_{fg3}$ and $V_{fg4}$. If the range of the converter is from
$V_0$ to $V_{fs}$, $V_{fg3}$ and $V_{fg4}$ would be set to the two ends of that range. These
switches implement a switched-capacitor subtraction of $V_{in} - (V_{fs} - V_0)$ if the output of the
comparator is high.
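The behavior of a chain of 1-bit stages can be sketched with an idealized textbook model. The Python code below is not a model of the switched-capacitor circuit in Figure 7.2. It assumes a comparator threshold at the middle of the range, an exact gain of 2, no offsets or charge injection, and a residue that is mapped back onto the full input range. The function names and default values are hypothetical.

```python
def pipeline_stage(v_in, v0, v_fs):
    """Ideal 1-bit stage: decide one bit, then double and shift the residue."""
    threshold = 0.5 * (v0 + v_fs)
    bit = 1 if v_in >= threshold else 0
    # Subtract the reference selected by the comparator so that the residue
    # again spans v0 to v_fs for the next stage.
    residue = 2.0 * v_in - (v_fs if bit else v0)
    return bit, residue


def convert(v_in, v0=0.0, v_fs=1.0, n_bits=10):
    """Pass the input through n_bits ideal stages and assemble the code."""
    code = 0
    voltage = v_in
    for _ in range(n_bits):
        bit, voltage = pipeline_stage(voltage, v0, v_fs)
        code = (code << 1) | bit
    return code


for v in (0.0, 0.25, 0.5, 0.999999):
    print(v, convert(v))
```

In a real stage the offsets and signal-independent charge injection shift the threshold and the subtracted reference, which is what the floating-gate voltages are intended to trim.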
[Figure 7.1 block diagram: ten 1-bit pipelined stages, each with its own floating-gate references, preceded by a closed-loop sample and hold. The supporting blocks are the digital delay elements, a parallel load shift register, the floating-gate programming circuitry, a 1:16 clock divider, and a 3 phase clock tree. The external connections are the analog input, the clock, and the digital output.]
Figure 7.1. Architecture of the floating-gate pipelined ADC. The ADC consists of 10 1-bit pipelined
stages, a closed-loop sample and hold, floating-gate voltage references, and digital support
circuitry. The floating-gate voltage references are used to trim the offsets created by each stage.
A total of four floating-gate references are used to trim offsets in each stage. Vfg1
removes the offset internal to the amplifier of the MDAC and compensates for any charge
injection from S1. Vfg2 removes the offset internal to the comparator. Vfg3 and Vfg4
remove the error due to charge injection from switches S2 to S7. The floating-gate voltage
references can compensate for any signal-independent charge injection, but the accuracy of
the converter will be determined by the signal-dependent charge injection as well as the
precision of the floating-gate programming. The programming must be able to resolve
voltages at least as low as half an LSB. Assuming the accuracy of the floating-gate
programming is much higher than the resolution of the converter, the signal-dependent
charge injection will be the dominant source of error.
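The effect of trimming can be illustrated with a small behavioral model. The sketch below is not the circuit described here: it assumes an idealized 1-bit-per-stage pipeline with a residue gain of 2, a normalized 0 to 1 V input range, and the same static offsets in every stage. The names convert, worst_code_error, amp_trim, and cmp_trim are hypothetical; the two trim values only loosely play the roles of Vfg1 and Vfg2, and signal-dependent charge injection is not modeled.

```python
def convert(vin, n_bits=10, vref=1.0, amp_offset=0.0, cmp_offset=0.0,
            amp_trim=0.0, cmp_trim=0.0):
    """Idealized 1-bit-per-stage pipeline; returns the code for vin in [0, vref)."""
    amp_error = amp_offset - amp_trim   # residual amplifier offset after trimming
    cmp_error = cmp_offset - cmp_trim   # residual comparator offset after trimming
    code, v = 0, vin
    for _ in range(n_bits):
        # Comparator decision against mid-scale, shifted by any residual offset.
        bit = 1 if v >= vref / 2 + cmp_error else 0
        # MDAC: multiply by 2, subtract the DAC level, add any residual offset.
        v = 2 * v - bit * vref + amp_error
        code = (code << 1) | bit
    return code


def worst_code_error(n_bits=10, **stage_errors):
    """Largest |code - ideal code| over all mid-code inputs."""
    n_codes = 1 << n_bits
    return max(abs(convert((k + 0.5) / n_codes, n_bits, **stage_errors) - k)
               for k in range(n_codes))


# Assumed, illustrative offsets: 2 mV in the amplifier, 3 mV in the comparator.
ideal = worst_code_error()
untrimmed = worst_code_error(amp_offset=0.002, cmp_offset=0.003)
trimmed = worst_code_error(amp_offset=0.002, cmp_offset=0.003,
                           amp_trim=0.002, cmp_trim=0.003)
print("worst-case code error:", ideal, untrimmed, trimmed)
```

In this model a signal-independent offset is cancelled exactly once the trim matches it, which is why the residual error in a real converter is expected to come from the signal-dependent terms and from the finite programming precision.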
The floating-gate reference is shown in Figure 7.3. The voltage reference is a wide-range
OTA with a floating-gate pFET input. The OTA, under unity-gain feedback, acts to
buffer the voltage on the floating node. A wide-range OTA was used in case the output
voltage of the reference needed to be near the rails. The bias current of each reference is
set so that it can drive its load. This is easily accomplished for Vfg1 and Vfg2, but Vfg3
and Vfg4 must be able to drive a large capacitive load at the speed required by the ADC.
This requires larger bias currents and creates the headroom issues that the wide-range OTA
helps to alleviate. The layout of the ADC is shown in Figure 7.4.
System-level simulations were run in order to verify the behavior of the ADC. The
floating-gate references were replaced by voltage sources with a resolution of 200 µV. A
simulation was run in order to test the INL of the converter, but because of the computing
power needed for such large simulations, only a limited number of codes were tested. The
results are shown in Figure 7.5. Note that the converter is 10-bit accurate when tested over
a 1 V range at a speed of 500 kS/s. Again, because of the enormous computing resources
needed for the simulations, DNL was not simulated. This ADC has been fabricated, both
as a separate system and as part of a CMOS imager. The chips are awaiting testing.
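As a quick check of these numbers, the LSB size can be worked out under the assumption that the 1 V test range is the full scale of the 10-bit converter (the text states only the tested range, so this is an assumption). The result is then compared with the half-LSB programming requirement stated for the floating-gate references:

```latex
\[
V_{\mathrm{LSB}} = \frac{1\,\mathrm{V}}{2^{10}} \approx 0.977\,\mathrm{mV},
\qquad
\tfrac{1}{2}V_{\mathrm{LSB}} \approx 0.488\,\mathrm{mV},
\qquad
\frac{200\,\mu\mathrm{V}}{V_{\mathrm{LSB}}} \approx 0.205\,\mathrm{LSB}.
\]
```

Under that assumption, the 200 µV source resolution used in simulation is finer than half an LSB.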
[Schematic labels: Vin, Vout, Vfg1, Vfg2, Vfg3, Vfg4, Comparator, switches S1, S2, S3, S4, S6, S7, blocks X1 and X2.]
Figure 7.2. Architecture of the pipelined ADC stage. Each stage consists of a latched comparator and
a switched capacitor multiplying DAC. A total of 4 voltages are set using floating gates in
every bit stage.
[Schematic labels: Vref, Mprog, Ms1, Ms2, Ms3, Ms4, Vtun, Vout, A, Vdrain, Vfg, Rsel, Csel.]
Figure 7.3. Schematic of the floating-gate voltage reference. The output voltage is simply the voltage
programmed onto a floating gate, Vfg, that has been buffered out by a wide-range OTA.
Transistor Mprog is used to alter the floating-gate charge using channel hot-electron injection.
Transistors Ms1–Ms4 are used for selection in the array programming scheme. Vref is
used as a global bias when the floating gates are not being programmed.
Figure 7.4. Layout of the ADC. The converter was designed to be small, using an area of only
2.1 mm × 0.64 mm.
[Plot: error (LSBs), from −0.50 to 0.50, versus input voltage (V), from 1.0 to 2.0.]
Figure 7.5. Simulation results of the 10-bit ADC. (a) INL error of the ADC for selected codes.
(b) Sample residue of one of the pipelined stages.
CHAPTER 8
THE PRESENT AND FUTURE OF TRANSLINEAR FPAAS
A MITE based translinear FPAA capable of implementing a wide range of static and
dynamic functions has been presented. In addition, a software chain capable of network
synthesis, place-and-route, and hardware programming has been implemented. The com-
plete tool chain, from equation to working hardware, is shown in Figure 8.1. While this
system shows the promise of translinear FPAAs, there is still development left to be done.
The future development of translinear FPAAs has a clear path due to the elegant nature
of the design flow they use. While a great deal of optimization can be done, the overall
structure of a translinear FPAA is determined by the synthesis procedures used. This is
contrary to the development of other FPAAs, where system level questions, such as which
components to use in a CAB, are still of primary importance. The major areas of devel-
opment left for the MITE FPAA, or other translinear FPAAs, are integration with other
components, optimization of the software chain, and integration of a translinear FPAA into
a mixed-signal reconfigurable platform.
8.1 Integration of Fixed Blocks
While the components of a CAB in a translinear FPAA are determined by the synthesis
procedure, other components can be included for applications that translinear circuits do
not handle efficiently. For example, if an application requires sub-banding of an input
signal, a large number of the MFPAA's filters will be used solely for this purpose. In
addition, a great deal of routing will be used. As sub-banding is a common function in
analog signal processing, a block of tunable bandpass filters can be included as a fixed
block on a translinear FPAA.
[switches, prog_biases, Circuit_info, Map_info] = MITE_Compile('[x^2+y^2]^(1/2)');
>> Circuit_info
Circuit_info = 
              mirrors: [2x3 double]
                gates: [3x2 double]
                coeff: [1 1 1]
                loops: [3x6 double]
                 vars: [9x6 double]
           expression: '[x^2+y^2]^(1/2)'
    SolvedExpressions: [2x10 char]
              varlist: [3x4 char]
>> Circuit_info.loops
ans =
     1     5     4     0     0     5
     6     2     0     0     0     7
     8     3     0     0     0     9
>> Circuit_info.gates
ans =
     1     3
     2     2
     2     2
>> Circuit_info.mirrors
ans =
     5     0     0
     9     7     1
>> Circuit_info.vars
ans =
     1     0     0     0     0     0
     2     1     0     0     0     0
     3     1     0     0     0     0
     4     0     0     1     0     1
     5     0     0     0     1     0
     6     0     0     1     0     1
     7     0     0     0     1     0
     8     0     0     1     0     1
     9     0     0     0     1     0
>> Circuit_info.varlist 
ans =
Var1
x   
y 
>> Map_info
Map_info = 
      num_loops: 4
      num_mirrors: 1
      num_inputs: 3
      switches: [2x44 double]
      global: [1x3 struct]
      cab: [6x3 struct]
      outputs: [8 0 0 0 0 0 0 0 0 0]
      inputs: [0 0 0 0 0 0 0 4 3 2]
      prog: [3x30 double]
      loops: [4x5 double]
      mirrors: [5 2 3]
      biases: [2x4 double]
>> Map_info.cab(5,2)
ans = 
      local: [0 0 0 0 7 9 10 99 99 99]
      nnv_up: [0 0 0 0 0 0 0 0 0 8]
      nnv_down: [0 0 0 0 0 0 0 6 1 5]
      df: 1
      loops: [8 7]
      filter: 0
      mirrors: [10 9 1 0 0 0]
      biases: [10 9 0 0 0 0]
      loop1: 'I5^(1)*I9^(1)*I10^(-1)'
      loop2: 'I3^(2)*I4^(-1)'
[Plot: Implementation of (x^2+y^2)^(1/2). Axes: Column and Row. Labeled points: (11,38), (12,37), (13,36), (18,41), (19,38), (21,37), (22,40), (27,41), (28,31), (29,40), (30,36), (33,39), (40,39), (41,35), (42,39).]
Figure 8.1. Example of the complete system implementing an equation in hardware. The user enters
an equation in MATLAB (top), the circuit is synthesized and routed (middle), and the hard-
ware is programmed accordingly (bottom).
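The loop strings in the Map_info output above, such as 'I5^(1)*I9^(1)*I10^(-1)', can be read mechanically. The sketch below is a hypothetical helper and not part of the tool chain shown in the figure. It assumes each string is a product of currents raised to integer powers, which is what the notation suggests, and it makes no claim about how the tool itself interprets or constrains the product. The names parse_loop and loop_product and the current values are invented for illustration.

```python
import re
from math import prod

# One factor of a loop string, e.g. "I10^(-1)" -> index 10, exponent -1.
_FACTOR = re.compile(r"I(\d+)\^\((-?\d+)\)")


def parse_loop(expr):
    """Return {current_index: exponent} for a string like 'I3^(2)*I4^(-1)'."""
    terms = {}
    for factor in expr.split("*"):
        match = _FACTOR.fullmatch(factor.strip())
        if match is None:
            raise ValueError(f"unrecognized factor: {factor!r}")
        terms[int(match.group(1))] = int(match.group(2))
    return terms


def loop_product(terms, currents):
    """Evaluate the product of currents[i] ** exponent over all terms."""
    return prod(currents[i] ** n for i, n in terms.items())


loop1 = parse_loop("I5^(1)*I9^(1)*I10^(-1)")
loop2 = parse_loop("I3^(2)*I4^(-1)")

# Hypothetical normalized currents, chosen so that both products equal 1.
currents = {3: 2.0, 4: 4.0, 5: 2.0, 9: 4.0, 10: 8.0}
print(loop1, loop_product(loop1, currents))
print(loop2, loop_product(loop2, currents))
```

A product-of-powers form is convenient because translinear constraints are of this kind, so the same dictionary could be checked against measured or simulated currents.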
8.2 Optimization of the Software Chain
A great deal of the development left to be done for the MITE FPAA is software related.
First, a full synthesis procedure, including dynamic functions, needs to be mapped onto the
MFPAA. Currently, only static functions have been implemented using the complete tool
chain. Second, the place-and-route algorithm used must be expanded to include the entire
MFPAA. While this would require a significant amount of time, it is a simple procedure.
While continued development of the MFPAA is rather simple, future generations of
translinear FPAAs should include more advanced routing structures. These structures might
include column-parallel processing with nearest neighbor connections or tree structures.
While these ideas will ultimately be implemented in hardware, the problems of synthesis
and place-and-route are optimization problems that require advanced knowledge in math-
ematics. It may also be possible to leverage some of the work done in developing FPGA
routing networks for use in a translinear FPAA.
8.3 Mixed-signal Reconfigurable Platforms
Another promising application of translinear FPAAs is in their combination with FPGAs
to form a mixed-signal reconfigurable platform. Field-programmable mixed arrays, those
combining analog and digital signals, have been previously proposed [35, 36]. However, a
large-scale implementation has yet to be fully developed. Again, because of the synthesized
design flow facilitated by the translinear FPAA, integration with an FPGA is made
easier. FPGAs use advanced synthesis and place-and-route algorithms that let the user
describe their system in a behavioral language, which opens the possibility of a simple
user interface incorporating both the analog and digital components of the system.
In addition, combining the MFPAA with a digital processor allows for many possibil-
ities. For example, an analog coprocessor for a computer, similar to the one proposed in
[37], can be easily created. The translinear FPAA would be able to solve complex sets of
differential equations without the convergence problems faced by digital computers. Its re-
sults would then be passed to the digital processor as an approximate solution, eliminating
the convergence problem completely. Furthermore, the computer could perform the syn-
thesis, place-and-route, and programming of the FPAA. The realization of such a system
could have a large effect on how analog circuits are designed and built in the future.
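The benefit of handing a digital solver an approximate solution can be illustrated with a toy example. The sketch below is only an analogy under stated assumptions: it uses Newton's method on the scalar equation arctan(x) = 0 rather than a set of differential equations, and it stands in for the analog result with the true root plus a small assumed error. The names newton, cold_start, and analog_estimate are hypothetical.

```python
import math


def newton(f, df, x0, tol=1e-10, max_iter=50, bound=1e6):
    """Plain Newton iteration; returns (x, converged)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if not math.isfinite(x) or abs(x) > bound:
            return x, False          # iterate has run away: report divergence
        if abs(step) < tol:
            return x, True
    return x, False


def f(x):
    return math.atan(x)              # root at x = 0


def df(x):
    return 1.0 / (1.0 + x * x)


cold_start = 3.0                     # poor initial guess: Newton diverges from here
analog_estimate = 0.0 + 0.05         # assumed: true root plus a small analog error

x_cold, ok_cold = newton(f, df, cold_start)
x_warm, ok_warm = newton(f, df, analog_estimate)
print("cold start converged:", ok_cold)
print("warm start converged:", ok_warm)
```

The iteration and the equation are the same in both runs; only the starting point changes, which is the role the approximate analog solution would play.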
REFERENCES
[1] C. Twigg, Floating Gate Based Large-Scale Field-Programmable Analog Arrays for
Analog Signal Processing. PhD thesis, Georgia Institute of Technology, 2006.
[2] U.-M. O’Reilly, “Potential uses of dynamically reconfigurable analog circuits,” tech. rep.,
Massachusetts Institute of Technology,
http://people.csail.mit.edu/unamay/research-abstracts/grace-abstract/grace-abstract.html.
[3] A. Stoica, D. Keymeulen, R. Zebulum, A. Thakoor, T. Daud, Y. Klimeck, R. Tawel,
and V. Duong, “Evolution of analog circuits on field programmable transistor arrays,”
in Evolvable Hardware, 2000. Proceedings. The Second NASA/DoD Workshop on,
pp. 99–108, 2000.
[4] J. Becker, F. Henrici, S. Trendelenburg, and Y. Manoli, “A rapid prototyping environ-
ment for high-speed reconfigurable analog signal processing,” in Parallel and Dis-
tributed Processing, 2008. IPDPS 2008. IEEE International Symposium on, pp. 1–4,
2008.
[5] B. Pankiewicz, M. Wojcikowski, S. Szczepanski, and Y. Sun, “A field programmable
analog array for cmos continuous-time ota-c filter applications,” Solid-State Circuits,
IEEE Journal of, vol. 37, no. 2, pp. 125–136, 2002.
[6] E. Ramsden, “The isppac family of reconfigurable analog circuits,” in Evolvable
Hardware, 2001. Proceedings. The Third NASA/DoD Workshop on, pp. 176–181,
2001.
[7] E. Lee and P. Gulak, “A transconductor-based field-programmable analog array,” in
Solid-State Circuits Conference, 1995. Digest of Technical Papers. 42nd ISSCC, 1995
IEEE International, pp. 198–199, 366, 1995.
[8] G. Serrano, P. Smith, H. Lo, R. Chawla, T. Hall, C. Twigg, and P. Hasler, “Automatic
rapid programming of large arrays of floating-gate elements,” in Circuits and Systems,
2004. ISCAS ’04. Proceedings of the 2004 International Symposium on, vol. 1, pp. I–
373–I–376 Vol.1, 2004.
[9] P. Smith, M. Kucic, and P. Hasler, “Accurate programming of analog floating-gate
arrays,” in Circuits and Systems, 2002. ISCAS 2002. IEEE International Symposium
on, vol. 5, pp. V–489–V–492 vol.5, 2002.
[10] H.-J. Lo, G. Serrano, P. Hasler, D. Anderson, and B. Minch, “Programmable multiple
input translinear elements,” in Circuits and Systems, 2004. ISCAS ’04. Proceedings of
the 2004 International Symposium on, vol. 1, pp. I–757–60 Vol.1, 2004.
[11] B. Minch, “Synthesis of static and dynamic multiple-input translinear element networks,”
Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 51,
no. 2, pp. 409–421, 2004.
[12] J. Mulder, W. Serdijn, A. van der Woerd, and A. van Roermund, Dynamic Translinear
and Log-Domain Circuits: Analysis and Synthesis. Boston, MA: Kluwer Academic
Publishers, 1999.
[13] S. Subramanian, D. Anderson, and P. Hasler, “Synthesis of static multiple input mul-
tiple output mite networks,” in Circuits and Systems, 2004. ISCAS ’04. Proceedings
of the 2004 International Symposium on, vol. 1, pp. I–189–I–192 Vol.1, 2004.
[14] E. McDonald and B. Minch, “Synthesis of translinear analog signal processing sys-
tems,” in Circuits and Systems, 2002. MWSCAS-2002. The 2002 45th Midwest Sym-
posium on, vol. 1, pp. I–204–7 vol.1, 2002.
[15] K. Odame and B. Minch, “The translinear principle: A general framework for im-
plementing chaotic oscillators,” Bifurcation And Chaos, International Journal Of,
vol. 15, no. 8, pp. 2559–2568, 2005.
[16] S. Subramanian, Methods for Synthesis of Multiple-Input Translinear Element Net-
works. PhD thesis, Georgia Institute of Technology, 2007.
[17] L. Sterpone and M. Violante, “A new reliability-oriented place and route algorithm
for sram-based fpgas,” Computers, IEEE Transactions on, vol. 55, no. 6, pp. 732–
744, 2006.
[18] S. Nag and R. Rutenbar, “Performance-driven simultaneous placement and routing
for fpga’s,” Computer-Aided Design of Integrated Circuits and Systems, IEEE Trans-
actions on, vol. 17, no. 6, pp. 499–518, 1998.
[19] C. Ababei, H. Mogal, and K. Bazargan, “Three-dimensional place and route for fp-
gas,” Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions
on, vol. 25, no. 6, pp. 1132–1140, 2006.
[20] S. Ganesan and R. Vemuri, “A methodology for rapid prototyping of analog systems,”
in Computer Design, 1999. (ICCD ’99) International Conference on, pp. 482–488,
1999.
[21] F. Baskaya, S. Reddy, S. K. Lim, and D. Anderson, “Placement for large-scale
floating-gate field-programmable analog arrays,” Very Large Scale Integration (VLSI)
Systems, IEEE Transactions on, vol. 14, no. 8, pp. 906–910, 2006.
[22] B. Fotouhi, “All-mos voltage-to-current converter,” Solid-State Circuits, IEEE Jour-
nal of, vol. 36, no. 1, pp. 147–151, 2001.
[23] R. Chen and T.-S. Hung, “A linear cmos voltage-to-current converter,” in Signals, Cir-
cuits and Systems, 2005. ISSCS 2005. International Symposium on, vol. 2, pp. 677–
680 Vol. 2, 2005.
[24] V. Srinivasan, R. Chawla, and P. Hasler, “Linear current-to-voltage and voltage-to-
current converters,” in Circuits and Systems, 2005. 48th Midwest Symposium on,
pp. 675–678 Vol. 1, 2005.
[25] Q.-H. Duong, T. Nguyen, and S.-G. Lee, “Cmos exponential current-to-voltage circuit
based on newly proposed approximation method,” in Circuits and Systems, 2004.
ISCAS ’04. Proceedings of the 2004 International Symposium on, vol. 2, pp. II–865–
8 Vol.2, 2004.
[26] T. Delbruck and C. Mead, “Adaptive photoreceptor with wide dynamic range,” in
Circuits and Systems, 1994. ISCAS ’94., 1994 IEEE International Symposium on,
vol. 4, pp. 339–342 vol.4, 1994.
[27] T. Delbruck and D. Oberhoff, “Self-biasing low power adaptive photoreceptor,” in
Circuits and Systems, 2004. ISCAS ’04. Proceedings of the 2004 International Sym-
posium on, vol. 4, pp. IV–844–7 Vol.4, 2004.
[28] A. Srivastava and R. Anantha, “A programmable oversampling sigma-delta analog-
to-digital converter,” in Circuits and Systems, 2005. 48th Midwest Symposium on,
pp. 539–542 Vol. 1, 2005.
[29] A. Pereira, P. Brady, A. Bandyopadhyay, and P. Hasler, “Experimental investigations
of floating-gate circuits for Δ-Σ modulators,” in Circuits and Systems,
2002. MWSCAS-2002. The 2002 45th Midwest Symposium on, vol. 1, pp. I–208–11
vol.1, 2002.
[30] V. Krishnan, C. Duffy, D. Anderson, and P. Hasler, “Optimal quantization employing
programmable flash analog to digital converters,” in Signals, Systems and Comput-
ers, 2004. Conference Record of the Thirty-Eighth Asilomar Conference on, vol. 1,
pp. 816–819 Vol.1, 2004.
[31] S. Tanaka, Y. Ghoda, and Y. Sugimoto, “The realization of a mismatch-free and 1.5-
bit over-sampling pipelined adc,” in Circuits and Systems, 2005. ISCAS 2005. IEEE
International Symposium on, pp. 6194–6197 Vol. 6, 2005.
[32] X. Wang, P. Hurst, and S. Lewis, “A 12-bit 20-msample/s pipelined analog-to-digital
converter with nested digital background calibration,” Solid-State Circuits, IEEE
Journal of, vol. 39, no. 11, pp. 1799–1808, 2004.
[33] S. Lewis, H. Fetterman, G. F. Gross, Jr., R. Ramachandran, and T. Viswanathan, “A
10-b 20-msample/s analog-to-digital converter,” Solid-State Circuits, IEEE Journal
of, vol. 27, no. 3, pp. 351–358, 1992.
[34] B. Razavi, Design of Analog CMOS Integrated Circuits. New York, NY: McGraw-
Hill, 2001.
[35] P. Chow and P. Gulak, “A field-programmable mixed-analog-digital array,” in Field-
Programmable Gate Arrays, 1995. FPGA ’95. Proceedings of the Third International
ACM Symposium on, pp. 104–109, 1995.
[36] J. Faura, C. Horton, B. Krah, J. Cabestany, M. Aguirre, and J. Insenser, “A new field
programmable system-on-a-chip for mixed signal integration,” in European Design
and Test Conference, 1997. ED&TC 97. Proceedings, pp. 610–, 1997.
[37] G. Cowan, R. Melville, and Y. Tsividis, “A vlsi analog computer/digital computer
accelerator,” Solid-State Circuits, IEEE Journal of, vol. 41, no. 1, pp. 42–53, 2006.
