Review and Classification of Gain Cell eDRAM Implementations by Teman, Adam et al.
2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel
Review and Classification of Gain Cell eDRAM
Implementations
Adam Teman∗, Pascal Meinerzhagen†, Andreas Burg†, and Alexander Fish∗
∗VLSI Systems Center, Ben-Gurion University of the Negev, Be’er Sheva, Israel
Email: teman@ee.bgu.ac.il, afish@ee.bgu.ac.il
†Institute of Electrical Engineering, EPFL, Lausanne, VD, 1015 Switzerland
Email: pascal.meinerzhagen@epfl.ch, andreas.burg@epfl.ch
Abstract—With the increasing requirement of a high-density,
high-performance, low-power alternative to traditional SRAM,
Gain Cell (GC) embedded DRAMs have gained a renewed inter-
est in recent years. Several industrial and academic publications
have presented GC memory implementations for various target
applications, including high-performance processor caches, wire-
less communication memories, and biomedical system storage.
In this paper, we review and compare the recent publications,
examining the design requirements and the implementation
techniques that lead to achievement of the required design
metrics of these applications.
I. INTRODUCTION
Embedded memories consume a dominant part of the overall
area of ASICs and Systems-on-Chip (SoCs), and according to
the 2011 International Technology Roadmap for Semiconduc-
tors (ITRS) [1] this trend will continue into the foreseeable
future. Power dissipation has become the main performance
limiter in modern microprocessors, and larger cache memo-
ries significantly improve micro-architectural performance and
utilization of multi-core systems with only a modest increase
in power [2,3]. In state-of-the-art processors, the die area
devoted to cache memories is approximately 50%; however,
memories occupy significant portions of lower performance
systems and components, as well. The standby power of ultra-
low power systems, such as biomedical implants and wireless
sensor networks, is also often dominated by their embedded
memories that continue to leak during long periods of system
standby.
The traditional choice of embedded memories has been the
6T SRAM, as it provides high-speed read and write perfor-
mance with robust static data retention. However, growing
memory capacities have led to significant efforts to replace the
relatively large SRAM bitcell with a smaller alternative. Con-
current read/write access is an effective method for achieving
high memory bandwidth [4], but two-ported SRAMs require
additional transistors to implement the unit cells, resulting
in even larger area demands. In addition, the off-transistor
leakage currents of SRAM cells have become one of the major
power consuming components in VLSI systems, especially
in standby mode. To combat power consumption, one of the
most effective solutions has been found to be lowering the
system supply voltage (VDD). However, depleted read and
write margins, coupled with increasing process variations,
limit the minimum operating voltage of SRAM arrays. Hence,
an appropriate candidate for the replacement of SRAM would
need to provide high density, low power, and low-voltage
operation, while retaining compatibility with standard logic
fabrication processes [5].
Embedded DRAMs (eDRAMs) have long been a candidate
for replacement of mainstream SRAMs in nanoscale CMOS
due to their small cell size and non-ratioed circuit opera-
tion. However, the conventional 1-transistor, 1-capacitor (1T-
1C) eDRAM requires costly process adders, and provides
limited voltage downscaling, while requiring frequent power
consuming refresh operations [6–8]. A partial solution to these
drawbacks is provided by logic compatible gain-cell (GC)
eDRAMs [7]. While the concept of gain cells dates back to
the early 1970s, they fell into oblivion due to the predominant
development of dedicated process technologies for stand-alone
SRAM and DRAM chips. Only during the last decade have
GC memories been discovered again as a potential alternative
to SRAM due to their potential for higher density, lower
power consumption, higher reliability, and 2-port functionality
in advanced nodes and at low voltages. Since 2005, around 20
publications from industry and academia show innovative GC
designs and array architectures, mostly aiming at replacing
SRAM as high-speed caches in high-end processors. A few
recent publications design and optimize GC memories for
use in wireless communications systems or other fault-tolerant
systems, while other work has verified the feasibility of low-
voltage operation, making GC arrays good candidates for
biomedical systems [2,3,5,8–21].
Gain cells are dynamic memory bitcells comprised of
2–3 standard logic transistors and optionally an additional
MOSCAP or diode. The additional devices (as compared to
their 1T counterparts) are used both to increase the in-cell
storage capacitance and to amplify the readout charge
flow as compared to the stored charge level, hence the
name “gain” cells [12]. The reduced device count results in a
much higher bitcell density, as compared to a standard SRAM,
while the decoupled read port provides both a non-destructive
read operation and two-ported functionality. Neither read nor
write operations suffer from the ratioed contention between
devices in a 6T SRAM, resulting in increased margins and
enabling voltage scaling [15,19]. Finally, leakage power is
highly reduced, as fewer devices suffer from Drain Induced
Barrier Lowering (DIBL), and scaled supply voltages reduce
other leakage components.
Despite these favorable features, gain cells suffer from
a number of drawbacks. The primary concern is the small
internal storage capacitor that results in short retention times,
requiring power-hungry refresh operations. In addition, the de-
pleted storage voltages following a long retention period result
in poor read performance. These characteristics are highly
dependent on process-voltage-temperature (PVT) variations,
thereby requiring careful margin distribution, cell tracking, and
reference voltage control [2].
In this paper, we examine the various gain cell implemen-
tation options and consider the resulting trade-offs. We review
the methods for contending with the drawbacks and improving
the performance of the circuits. As a result, we will discuss
the compatibility of the existing designs to various target
applications according to energy-efficiency aspects of these
implementations.
II. CATEGORIZATION OF GAIN CELL ARRAYS
From the large number of recent publications on GC mem-
ories, it is possible to identify three main categories of target
applications: 1) high-end processors requiring large embedded
cache memories; 2) fault-tolerant systems including channel
decoders for wireless communications; and 3) low-voltage
low-power biomedical systems.
A. Gain Cells for High-End Processors
The vast majority of recent research on GC memories is
dedicated to large embedded cache memories for micropro-
cessors [2,3,5,9,11–14,16,22,23]. In fact, GC memories are
considered to be an interesting alternative to SRAM, which has
been the dominant solution for cache memories for decades.
This is due to GC eDRAM’s higher density, increased speed,
and potentially lower leakage power. Besides the obvious
advantage of high integration density, the main design goals
for GC memories in this application category are high-speed
operation and high memory bandwidth, especially for industrial
players like IBM [13] and Intel [5,22], and recently also for
academia [3,23]. A smaller number of research groups specify
low power consumption as their primary design goal [2,14]. A
recent study shows that GC memories can potentially consume
less data retention power (i.e., the sum of leakage power and
refresh power) than SRAM arrays (leakage power only) [3].
B. General Systems-on-Chip
Several authors are not very specific about their target
applications [7,10,24], as they only mention general SoCs.
However, they follow the same trend as the aforementioned
processor community by proposing GC memories as a re-
placement for the mainstream 6T SRAM solution. For these
SoC applications, the main drivers are the potential for higher
density and lower power consumption than SRAM.
C. Gain Cells for Wireless Communications Systems
A small number of recently presented GC memory designs
are fundamentally different from the aforementioned work, as
they are specifically built and optimized for systems which
require only short retention times, and, in some cases, are toler-
ant to a small number of hardware defects (read failures) [25].
The refresh-free GC memory used in a recently published low-
density parity check (LDPC) decoder is periodically updated
with new data, and therefore requires a retention time of
only 20 ns [20]. Besides safely skipping power-hungry refresh
cycles and designing for low retention times, the work in [8,21]
also exploits the fact that wireless communications systems
and other fault-tolerant systems are inherently resilient to
a small number of hardware defects. In fact, by proposing
memories based on multilevel GCs, the storage density of GC
memories is further increased at the price of a small number
of read failures which do not significantly impede the system
performance [8,21].
D. Gain Cells for Biomedical Systems
While the previously described target applications require
relatively high memory bandwidth, several recent GC memory
publications target low-voltage low-power biomedical applications.
A GC memory implemented in a mature low-leakage
180 nm CMOS process achieves low retention power through
voltage scaling well below the nominal supply voltage [15].
The positive impact of supply voltage scaling on retention time
for given access statistics and a given write bitline control
scheme is demonstrated in [18], proposing near-threshold
(NVT) operation for longer retention times and therefore
lower retention power. A recent study [19] shows that the
supply voltage of GC arrays can even be scaled down to the
subthreshold (sub-VT) domain, while still guaranteeing robust
operation and high memory availability for read and write
operations.
Fig. 1. Bandwidth vs. technology node of several previously published
studies. [Plot not reproduced: technology node [nm] on the horizontal axis
(65 to 180), bandwidth [Gb/s] on a logarithmic vertical axis (10^-3 to 10^3);
data points are labeled by reference number and grouped as Wireless & SoC,
High-end processors, and Biomedical.]
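The memory availability mentioned for [19] can be made concrete with a rough estimate. The sketch below is not taken from [19]; it assumes that every row is refreshed once per retention period and that each row refresh blocks the array for a fixed number of clock cycles. The function name `memory_availability`, the row count, and the cycles per refresh are hypothetical, while the 500 kHz clock and 40 ms retention time are the figures listed for [19] in Table I.

```python
def memory_availability(rows: int, cycles_per_row: int,
                        clock_hz: float, retention_s: float) -> float:
    """Fraction of each retention period left for normal read/write accesses.

    Assumption (not from the paper): every row is refreshed once per
    retention period, and one row refresh occupies `cycles_per_row` cycles.
    """
    if min(rows, cycles_per_row) <= 0 or clock_hz <= 0 or retention_s <= 0:
        raise ValueError("all arguments must be positive")
    refresh_time_s = rows * cycles_per_row / clock_hz
    return max(0.0, 1.0 - refresh_time_s / retention_s)


# 500 kHz and 40 ms: values listed for [19] in Table I.
# 64 rows and 2 cycles per row refresh: hypothetical illustration values.
availability = memory_availability(rows=64, cycles_per_row=2,
                                   clock_hz=500e3, retention_s=40e-3)
print(f"availability = {availability:.4f}")
```

Under these assumptions, refresh occupies well under 1% of the time, which illustrates why long retention times translate directly into high availability.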
E. Comparison of State-of-the-Art Implementations
Fig. 1 shows the bandwidth and the technology node of
state-of-the-art GC memory implementations, highlighted ac-
cording to target application categories. References appearing
multiple times correspond to different operating modes or
operating points of the same design. The achieved memory
bandwidth differs by more than four orders of magnitude
among the various implementations. GC memories designed
as cache memory for processors achieve around 10 Gb/s if
implemented in older technologies and over 100 Gb/s if im-
plemented in a more advanced 65 nm node. Most memories
designed for wireless communications systems or generally
for SoCs still achieve bandwidths between 1 and 10 Gb/s.
Only the high-density multilevel GC array has a lower band-
width due to a slow successive approximation multilevel read
operation [21]. GC memories targeted towards biomedical
systems are preferably implemented in a mature, reliable
180 nm CMOS node and achieve sufficiently high bandwidths
between 10 Mb/s and several 100 Mb/s at NVT or sub-VT
supply voltages.
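The bandwidth spread quoted above can be checked with a quick calculation. This is a minimal sketch that takes the approximate endpoints mentioned in the text (10 Mb/s for the slowest biomedical arrays and 100 Gb/s for the 65 nm cache designs) as rounded assumptions rather than exact measurements; the helper name `orders_of_magnitude` is hypothetical.

```python
import math


def orders_of_magnitude(high: float, low: float) -> float:
    """Return log10(high / low), i.e. the spread expressed in decades."""
    if high <= 0 or low <= 0:
        raise ValueError("both values must be positive")
    return math.log10(high / low)


# Rounded endpoints taken from the text (assumed, not exact data points).
BIOMEDICAL_MIN_BPS = 10e6    # 10 Mb/s
PROCESSOR_MAX_BPS = 100e9    # 100 Gb/s

spread = orders_of_magnitude(PROCESSOR_MAX_BPS, BIOMEDICAL_MIN_BPS)
print(f"bandwidth spread: {spread:.1f} decades")
```

Since the fastest cache designs exceed 100 Gb/s, the actual spread is somewhat larger than the four decades obtained from these rounded endpoints.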
Fig. 2 plots the retention power (i.e., the sum of re-
fresh power and leakage power) of previously reported GC
memories versus their retention time. For energy-constrained
biomedical systems, long retention times of 1–10 ms are a
key design goal in order to achieve low retention power of
between 600 fW/bit and 10 pW/bit. The memory banks of the
LDPC decoder have a nominal retention time of 1.6 µs [20],
which is around four orders-of-magnitude lower than that of
the arrays targeted at biomedical systems. Even though the
reported power consumption of 5 µW/bit corresponds to active
power [20], it is fair to compare it to the retention power
of other implementations, as data would need to be
refreshed at the same rate as new data is written. Interestingly,
the power consumption per bit of this refresh-free eDRAM
is almost seven orders-of-magnitude higher than the retention
power per bit of the most efficient eDRAM implementation for
biomedical systems. The retention time and retention power
of GC memories for processors are in between the values for
the wireless and biomedical application domains. Overall,
Fig. 2 clearly shows that extending the retention time
is an effective way to lower the retention power.
Fig. 2. Retention Power vs. Retention Time for several previously published
studies. [Plot not reproduced: retention time [µs] on a logarithmic horizontal
axis (10^0 to 10^4), retention power [fW/bit] on a logarithmic vertical axis
(10^2 to 10^10); data points are labeled by reference number and grouped as
Biomedical, Wireless, and High-end processors.]
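The comparison above can be reproduced numerically. The sketch below uses the two per-bit figures quoted in this paper: 5 µW/bit for the refresh-free array [20] and 662 fW/bit for [15], as listed in Table I. It also illustrates why longer retention lowers retention power, under a simple first-order assumption that each bit costs a fixed refresh energy once per retention period. The function name, the refresh energy, and the leakage value are hypothetical and chosen only for illustration.

```python
import math


def retention_power_per_bit(leak_w: float, refresh_energy_j: float,
                            retention_s: float) -> float:
    """First-order model (assumption): leakage plus one refresh per retention period."""
    if retention_s <= 0:
        raise ValueError("retention time must be positive")
    return leak_w + refresh_energy_j / retention_s


# Figures quoted in the text and in Table I.
P_LDPC_W = 5e-6       # W/bit, active power of the refresh-free array [20]
P_BIO_W = 662e-15     # W/bit, retention power reported for [15]
decades = math.log10(P_LDPC_W / P_BIO_W)
print(f"ratio: {decades:.2f} decades")

# Hypothetical values, for illustration only.
E_REFRESH_J = 10e-15  # refresh energy per bit
P_LEAK_W = 100e-15    # leakage power per bit
for t_ret in (10e-6, 1e-3, 100e-3):
    p = retention_power_per_bit(P_LEAK_W, E_REFRESH_J, t_ret)
    print(f"t_ret = {t_ret:.0e} s -> {p:.3e} W/bit")
```

In this model the refresh term shrinks in proportion to 1/t_ret until leakage dominates, which matches the trend visible across the application categories.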
The area cost per bit (ACPB) is defined as the silicon area
of the entire memory macro (including peripheral circuits),
divided by the storage capacity. As opposed to the simple
bitcell size metric, ACPB accounts for the area overhead of
peripheral circuits and is a more suitable metric to compare
different memory implementations. Moreover, we define the
array efficiency as the bitcell size divided by the ACPB
to normalize this metric independent of technology node.
Fig. 3 shows the comparably higher ACPB of biomedical GC
memories due to the use of a mature 180 nm CMOS node.
However, despite their small storage capacity requirements,
these implementations achieve a high array efficiency of over
0.5, by using small yet slow peripherals [15]. On the other
hand, none of the GC memories targeted toward processors,
wireless communications, or SoC applications achieves an
array efficiency as high as 0.5, meaning that over half of the
area of the macrocell is occupied by peripheral circuits.
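The two metrics defined above reduce to two divisions. The sketch below follows the definitions in the text; the function names and the example macro (area, capacity, and bitcell size) are hypothetical and do not correspond to any of the reviewed designs.

```python
def area_cost_per_bit(macro_area_um2: float, capacity_bits: int) -> float:
    """ACPB: total macro area (including peripherals) divided by capacity."""
    if capacity_bits <= 0 or macro_area_um2 <= 0:
        raise ValueError("area and capacity must be positive")
    return macro_area_um2 / capacity_bits


def array_efficiency(bitcell_area_um2: float, acpb_um2: float) -> float:
    """Array efficiency: bitcell size divided by ACPB (dimensionless)."""
    if bitcell_area_um2 <= 0 or acpb_um2 <= 0:
        raise ValueError("areas must be positive")
    return bitcell_area_um2 / acpb_um2


# Hypothetical 16 kb macro, for illustration only.
acpb = area_cost_per_bit(macro_area_um2=20000.0, capacity_bits=16384)
efficiency = array_efficiency(bitcell_area_um2=0.7, acpb_um2=acpb)
print(f"ACPB = {acpb:.3f} um^2/bit, array efficiency = {efficiency:.2f}")
```

An efficiency above 0.5, as in this example, means the bitcells occupy more than half of the macro area.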
Fig. 3. Array efficiency vs. Area Cost Per Bit for several previously published
studies. [Plot not reproduced: area cost per bit [µm2/bit] on the horizontal
axis (0 to 10), array efficiency on the vertical axis (0.2 to 0.7); data points
are labeled by reference number and grouped as Biomedical, High-end processors,
Wireless, and SoC.]
III. CIRCUIT TECHNIQUES FOR TARGET APPLICATIONS
In the previous section, we examined the recently proposed
GC arrays and analyzed their target systems and applications.
A primary conclusion was that gain cells have been shown to
be an attractive alternative to traditional SRAM arrays for large
caches, ultra-low power systems, and wireless communication
systems. In this section, we will take a closer look at the
circuits used in these proposals, and analyze the compatibility
of these techniques with their target metrics.
A. Gain-Cell Topologies
An extensive comparison between recent GC topologies is
presented in Table I. The common feature for all these circuits
is their reduced device count, as compared to traditional
SRAM circuits. The highest device count appears in [13],
comprising three transistors and a gated diode, with all other
proposals made up of two [3,5,11,15,18,19,22] or three [2,8–
10,12,14,20,21,24] transistors. The obvious implication of the
transistor count is the bitcell size; however, the choice of the
topology is application dependent, as well. The simple struc-
ture of the 2T topologies usually includes a write transistor
(MW) and a read transistor (MR). MW connects the write
bit line (WBL) to the storage node when the write word line
(WWL) is asserted, and MR amplifies the stored signal by
driving a current through the read bit line (RBL) when the
read word line (RWL) is asserted. The 2T structure results in
coupling effects between the control lines and storage node,
which can affect the data and degrade performance. Therefore,
a third device is often added, primarily to decouple RBL from
the storage node and reduce RBL leakage. This option enables
the designer to trade off density for enhanced performance, ro-
bustness, and/or retention time. This trade-off is quite apparent
in the cache designs, as the larger capacity systems [3,5,11]
prefer the 2T topology at the cost of additional hardware to
retain performance. The Boosted 3T topology of [2] actually
utilizes the coupling effect to extend the retention time by
connecting MR to RWL rather than ground, thereby negating
some of the positive voltage step inherent to the PMOS MW
configurations. An interesting choice of the 2T topology was
used in [19] even though the target application was a small
array for ultra-low power biomedical sensors. In this case, the
stacked readout path of the 3T topology proved to be too slow
under sub-VT biases.
One of the basic considerations that differentiate between
high-performance and low-power systems is the refresh power.
Whereas high-performance systems may employ a destructive
read operation with write-back, low-power systems ensure a
non-destructive read and try to maintain high retention times
to minimize refresh power. This is apparent in the “Main
Design Metric” row of Table I, showing orders-of-magnitude
difference in retention time between the two target categories.
TABLE I
DRIVER OPERATING MODES
(The Bitcell row of the original table contains circuit schematics, which are not reproduced here.)

Category | High Performance Processor Caches | High Performance Processor Caches | High Performance Processor Caches | High Performance Processor Caches | High Performance Processor Caches | High Performance Processor Caches
Publication | [9,12,24] | [11] | [13] | [5,22] | [2,14] | [3]
Tech. Node | 0.12 µm, 0.13 µm, 65 nm PTM | 0.15 µm | 90 nm | 65 nm | 65 nm | 65 nm
Techniques | Gated Diode, Footer Power Gating, Foot Driver | Multi-Level Bitlines, Hybrid open bitline architecture | Gated Diode Sense Amplifier | RBL Clamping, Pipelined Architecture | Boosted 3T, PVT tracking read reference feedback, Regulated WBL | Half Swing WBL, Stepped WWL
Main Design Metric | 400 MHz, 70 µs retention, 100 kb | 400 MHz, 100 µs retention, 1 Mb | up to 2 GHz, 110 µs retention, 40 kb | 2 GHz, 10 µs retention, 2 Mb | 500 MHz, up to 1.25 ms ret., 64 kb | 667 MHz, 110 µs ret., 192 kb

Category | General SoC | Wireless | Wireless | Low Power Biomedical Systems | Low Power Biomedical Systems | Low Power Biomedical Systems
Publication | [10] | [8,21] | [20] | [15] | [18] | [19]
Tech. Node | 90 nm | 90 nm | 65 nm | 0.18 µm | 0.18 µm | 0.18 µm
Techniques | Forced Feedback, Write Echo Refresh | Multi Level Bitcell, PVT Replica Column | Refresh Free, Sequential Decoding | I/O Write Transistor, Low Area Sense Buffer | Low Area Sense Buffer | Hybrid Cell with I/O MW, Sense Buffer
Main Design Metric | VDD = 0.5 V, 180 µA ref. power, 5 MHz | 2–50 µs retention, 1.45 µm2/bit density | 32 × 1 kb arrays, 700 MHz, 170 ns retention | VDD = 0.75 V, up to 306 ms ret., 0.1–1 MHz, 662 fW/bit ret. power | VDD = 0.75 V, 3.3 ms retention, 11.9 pW/bit ret. power | VDD = 400 mV, over 40 ms ret., 500 kHz

B. Device Choices
The majority of today’s CMOS process technologies provide
several device choices, manipulating different oxide
thicknesses and channel implants to create several threshold
(VT) and voltage tolerance options. Careful choice of the
appropriate device (PMOS/NMOS, standard/high/low VT) can
provide orders-of-magnitude improvement in GC performance,
as apparent in Table I. PMOS devices suffer from lower drive
strength than their NMOS counterparts, but have substantially
lower sub-VT and gate leakage. Since the majority of GC
implementations are read access limited, PMOS devices are
used in the vast majority of the proposed circuits. For most of
the common process technologies, the primary cause of storage
node charge loss is sub-VT leakage through MW, and therefore
the ultra-low power implementations [15,19] employ a high-
VT or I/O PMOS to substantially extend retention time. Gate
leakage is a substantial contributor in thin oxide nodes, and so
the all-PMOS 2T configuration [5] balances the sub-VT and
gate leakages out of and in to the storage node to improve
retention time. The decoder system of [20] requires high
performance with very short retention times, and therefore an
all NMOS low-VT circuit is used. Low-VT devices are used
in the readout path of several other publications [3,10], to
improve read performance without a large static power penalty,
as the voltage drop over the read node is minimal during write
and standby cycles.
An important effect of device selection is
the storage node coupling and charge injection. WWL access
significantly modifies the initial level of the storage node,
depending on several factors. A PMOS write transistor passes
a weak ′1′, and an NMOS passes a weak ′0′; therefore an
underdrive (PMOS) or boosted (NMOS) access voltage of
WWL is necessary to pass a full level to the storage node.
However, the larger the WWL swing is, the larger the step in
the direction of the deassertion at the storage node. A PMOS
MW, for example, is cut-off by the rising edge of WWL,
resulting in both capacitive coupling and charge injection to
the storage node. Therefore, the initial ′0′ value will always
be significantly higher than ground for a PMOS MW, and the
initial ′1′ value will be significantly lower than VDD for an
NMOS device. This limits the storage node range and degrades
both the readout overdrive, as well as the retention time. Using
a same-type device for MR of a 2T cell induces an additional
step in the same direction during read access, further impeding
the performance. A hybrid cell, mixing NMOS and PMOS
devices [3,8,10,19,21], can be used to combat these effects at
the expense of in-bit well separation.
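The dependence of the storage node step on the WWL swing can be illustrated with a first-order capacitive divider. This is a sketch under stated assumptions, not a model from the reviewed papers: it considers only capacitive coupling (charge injection is ignored), treats the capacitances as constant, and uses hypothetical capacitance and voltage values. The function name `coupling_step` is likewise hypothetical.

```python
def coupling_step(wwl_swing_v: float, c_couple_f: float, c_other_f: float) -> float:
    """Storage node voltage step caused by a WWL edge (capacitive divider).

    wwl_swing_v: signed WWL transition at deassertion
                 (positive for a PMOS MW, negative for an NMOS MW).
    c_couple_f:  WWL-to-storage-node coupling capacitance.
    c_other_f:   all remaining storage node capacitance.
    """
    if c_couple_f < 0 or c_other_f <= 0:
        raise ValueError("capacitances must be non-negative and c_other_f positive")
    return wwl_swing_v * c_couple_f / (c_couple_f + c_other_f)


# Hypothetical values: 0.1 fF coupling, 1 fF remaining storage capacitance.
C_COUPLE = 0.1e-15
C_OTHER = 1.0e-15

nominal = coupling_step(1.2, C_COUPLE, C_OTHER)      # PMOS MW, WWL swing of 1.2 V
underdrive = coupling_step(1.7, C_COUPLE, C_OTHER)   # PMOS MW with underdriven WWL
print(f"step at 1.2 V swing: {nominal * 1e3:.0f} mV")
print(f"step at 1.7 V swing: {underdrive * 1e3:.0f} mV")
```

The step grows linearly with the WWL swing, so the underdrive needed to write a full level also raises the initial ′0′ level of a PMOS write transistor.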
C. Circuit Techniques
In addition to the choice of a circuit topology and device
options, several circuit techniques have been demonstrated to
further improve system performance according to the target
application. One simple and efficient technique is the employ-
ment of a sense buffer in place of a standard sense amplifier
(SA) in low-power systems [15,18,19]. This implementation
requires a larger RBL swing, trading off speed for area and
PVT sensitivity. The area trade-off is apparent in Fig. 3
as [15] shows exceptionally high area efficiency. Several
other SA configurations have been demonstrated to deal with
various design challenges. The authors of [10] proposed a
forced feedback SA to enable operation at voltages as low
as 0.5 V. Chun, et al. [3] overcome the problem of small
RBL voltage swing by using a current mode SA featuring
a cross-coupled PMOS latch and pseudo-PMOS diode pairs.
Other SA designs used include p-type gated diodes [9,12,13],
offset compensating amps [11], single-ended thyristors [20],
and standard latches [5]. The most complex sensing scheme
is used for Multi-Level Bitcells in [8,21]. To decipher the four
data levels, a successive approximation SA is used.
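The principle of a successive approximation read can be sketched behaviorally as a binary search over reference thresholds. This is not the circuit of [8,21]; it is an illustration assuming an ideal comparator, and the threshold voltages and the function name `successive_approximation_read` are hypothetical.

```python
from typing import Sequence


def successive_approximation_read(v_sense: float, thresholds: Sequence[float]) -> int:
    """Resolve a sensed voltage into a level index by binary search.

    `thresholds` must be sorted in ascending order; len(thresholds) + 1
    levels are distinguished. Each loop iteration models one comparator
    decision against one reference voltage.
    """
    lo, hi = 0, len(thresholds)  # remaining candidate level indices: lo..hi
    while lo < hi:
        mid = (lo + hi) // 2
        if v_sense > thresholds[mid]:
            lo = mid + 1         # level lies above this reference
        else:
            hi = mid             # level lies at or below this reference
    return lo


# Hypothetical reference voltages separating four stored levels.
THRESHOLDS_V = (0.2, 0.5, 0.8)
levels = [successive_approximation_read(v, THRESHOLDS_V) for v in (0.1, 0.3, 0.6, 0.9)]
print(levels)
```

With four levels, two sequential comparisons are needed per read, which is consistent with the lower bandwidth reported for the multilevel array compared with single-comparison binary sensing.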
Several publications [10,15,18,19] discharge WBL during
non-write operations to extend the retention time, which is worse
for a stored ′0′ than for a ′1′ with a PMOS MW. A Write Echo
Refresh technique was employed by Ichihashi, et al. [10], to
further reduce the WBL=′1′ disturbance. In this technique,
the number of ′1′ write-back operations during refresh is
counted and oppositely biased to combat the disturbance. The
authors of [2] recognized that the steady state level of a ′1′
and ′0′ is common, so they monitor this level and use it as
the WBL voltage for writing a ′1′. This minimizes the ′0′
level disturbance without impeding the worst-case ′1′ level.
For the system proposed in [3], WBL switching speed is the
performance bottleneck, and therefore a half-swing WBL is
employed, improving the write speed and reducing the write
power.
An issue that is rarely discussed in 2T bitcell imple-
mentations is the voltage saturation of RBL during readout.
Depending on the implementation of MR, readout is achieved
by either charging (NMOS) or discharging (PMOS) RBL.
However, once RBL crosses a threshold (depending on the
current ratio of the selected bitcell and the number of off
unselected cells), a steady state is reached. This phenomenon
not only limits the swing available for RBL sensing, but also
causes static current dissipation that is present throughout the
entire read operation. This is one of the phenomena considered
in the analysis of [18] resulting in an optimal choice of VDD
for a low-power GC. Somasekhar, et al. [5] combat the
self-clamping of RBL by explicitly clamping its voltage with
designated devices.
IV. CONCLUSION
In this paper, we reviewed and compared the recently
proposed GC memories, categorizing them according to target
applications and overviewing the characteristics that make
them appropriate for these applications. A closer look into
the circuit design of these arrays provided further insight
into the methods used to achieve the required design metrics
through the use of different bitcell topologies, device options,
technology nodes, and peripheral implementations. To summa-
rize briefly, the following best practice guidelines should be
followed when designing GC arrays for future applications:
• High-VT write access transistors for long retention times
and low refresh power, in conjunction with area-efficient
sense buffers for high array efficiency are most suitable
to meet the storage requirements of biomedical systems.
• High-speed applications should use sensitive sense ampli-
fiers to overcome small voltage differences, and should
consider the use of LVT readout transistors for improved
read access.
• Frequently updating wireless communication systems can
trade off retention time for high-speed access to
achieve improved bandwidth.
ACKNOWLEDGMENT
This work was kindly supported by the Swiss National
Science Foundation under the project number PP002-119057.
REFERENCES
[1] “International technology roadmap for semiconductors,” 2009. [Online].
Available: http://www.itrs.net
[2] K. C. Chun et al., “A 3T gain cell embedded DRAM utilizing prefer-
ential boosting for high density and low power on-die caches,” IEEE
JSSC, 2011.
[3] K. Chun et al., “A 667 MHz logic-compatible embedded DRAM
featuring an asymmetric 2T gain cell for high speed on-die caches,”
IEEE JSSC, 2012.
[4] M. Kaku et al., “An 833MHz pseudo-two-port embedded DRAM for
graphics applications,” in Proc. IEEE ISSCC, 2008.
[5] D. Somasekhar et al., “2 GHz 2 Mb 2T gain cell memory macro with
128 GBytes/sec bandwidth in a 65 nm logic process technology,” IEEE
JSSC, vol. 44, no. 1, pp. 174–185, 2009.
[6] S. Hong et al., “Low-voltage DRAM sensing scheme with offset-cancellation
sense amplifier,” IEEE JSSC, 2002.
[7] N. Ikeda et al., “A novel logic compatible gain cell with two transistors
and one capacitor,” in Proc. of Symposium on VLSI Technology, 2000,
pp. 168–169.
[8] P. Meinerzhagen et al., “Design and failure analysis of logic-compatible
multilevel gain-cell-based DRAM for fault-tolerant VLSI systems,” in
Proc. IEEE GLSVLSI, 2011.
[9] W. Luk and R. Dennard, “2T1D memory cell with voltage gain,” in
Proc. of IEEE Symposium on VLSIC, 2004, pp. 184–187.
[10] M. Ichihashi et al., “0.5 V asymmetric three-tr. cell (ATC) DRAM using
90nm generic CMOS logic process,” in Proc. IEEE Symposium on VLSI
Circuits, 2005, pp. 366–369.
[11] D. Somasekhar et al., “A 10Mbit, 15GBytes/sec bandwidth 1T DRAM
chip with planar mos storage capacitor in an unmodified 150nm logic
process for high-density on-chip memory applications,” in Proc. of IEEE
ESSCIRC, 2005, pp. 355–358.
[12] W. Luk and R. Dennard, “A novel dynamic memory cell with internal
voltage gain,” IEEE JSSC, vol. 40, no. 4, pp. 884–894, April 2005.
[13] W. Luk et al., “A 3-transistor DRAM cell with gated diode for enhanced
speed and retention time,” in Proc. IEEE Symposium on VLSI Circuits,
2006, pp. 184–185.
[14] K. C. Chun et al., “A sub-0.9V logic-compatible embedded DRAM with
boosted 3T gain cell, regulated bit-line write scheme and PVT-tracking
read reference bias,” in Proc. IEEE Symposium on VLSI Circuits, 2009.
[15] Y. Lee et al., “A 5.4nW/kB retention power logic-compatible embedded
DRAM with 2T dual-VT gain cell for low power sensing applications,”
in Proc. IEEE A-SSCC, 2010.
[16] W. Zhang et al., “Variation aware performance analysis of gain cell
embedded DRAMs,” in Proc. ACM/IEEE ISLPED, pp. 19–24.
[17] K. C. Chun et al., “Logic-compatible embedded DRAM design for
memory intensive low power systems,” in Proc. of IEEE ISCAS, June
2010, pp. 277–280.
[18] R. Iqbal et al., “Two-port low-power gain-cell storage array: voltage
scaling and retention time,” in Proc. IEEE ISCAS, 2012.
[19] P. Meinerzhagen, A. Teman et al., “A sub-VT 2T gain-cell memory for
biomedical applications,” in Proc. IEEE Sub-VT, Pre-Publication - 2012.
[20] Y. Park et al., “A 1.6 mm2 38-mW 1.5 Gb/s LDPC decoder enabled by
refresh-free embedded DRAM,” in Proc. of IEEE Symposium on VLSIC,
2012, pp. 114–115.
[21] M. Khalid, P. Meinerzhagen, and A. Burg, “Replica bit-line technique
for embedded multilevel gain-cell DRAM,” in Proc. of IEEE NEWCAS,
June 2012.
[22] D. Somasekhar et al., “2GHz 2Mb 2T gain-cell memory macro with
128GB/s bandwidth in a 65nm logic process,” in Proc. IEEE ISSCC,
2008.
[23] K. C. Chun et al., “A 1.1V, 667MHz random cycle, asymmetric 2T
gain cell embedded DRAM with a 99.9 percentile retention time of 110
µsec,” in Proc. IEEE Symposium on VLSIC, June 2010, pp. 191–192.
[24] M. Chang et al., “A 65nm low power 2T1D embedded DRAM with
leakage current reduction,” in Proc. IEEE SOCC, 2007, pp. 207–210.
[25] G. Karakonstantis, C. Roth, C. Benkeser, and A. Burg, “On the exploita-
tion of the inherent error resilience of wireless systems under unreliable
silicon,” in Proc. of IEEE DAC, June 2012, pp. 510–515.
