Thermal profiling in CMOS/memristor hybrid architectures by Merkel, Cory




This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion
in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact ritscholarworks@rit.edu.
Recommended Citation
Merkel, Cory, "Thermal profiling in CMOS/memristor hybrid architectures" (2011). Thesis. Rochester Institute of Technology.




A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of
Master of Science in Computer Engineering
Supervised by
Dr. Dhireesha Kudithipudi
Department of Computer Engineering
Kate Gleason College of Engineering





Assistant Professor, Department of Computer Engineering
Primary Adviser
Dr. James Moon
Associate Professor, Department of Electrical and Microelectronic Engineering
Dr. Marcin Lukowiak
Assistant Professor, Department of Computer Engineering
Dr. Santosh Kurinec
Professor, Department of Electrical and Microelectronic Engineering
Thesis Release Permission Form
Rochester Institute of Technology
Kate Gleason College of Engineering
Title:
Thermal Profiling in CMOS/Memristor Hybrid Architectures
I, Cory E. Merkel, hereby grant permission to the Wallace Memorial Library to repro-




To my family, for all of your love and support.
Acknowledgments
This work would not have been possible without the help of countless faculty, staff, and
peers. I would like to thank my advisor, Dr. Dhireesha Kudithipudi, for her guidance and
dedication to this work. I would also like to thank Dr. James Moon, Dr. Marcin
Lukowiak, and Dr. Santosh Kurinec for taking time out of their busy schedules to serve as
committee members. Finally, I am extremely grateful for all of the help and support from
the RIT Computer Engineering staff, including Joe Walton and Richard Tolleson.
Abstract
CMOS/memristor hybrid architectures combine conventional CMOS processing elements
with thin-film memristor-based crossbar circuits for high-density reconfigurable systems.
Research on these architectures has grown rapidly over the past few years,
following the first practical demonstration of a thin-film memristor in 2008. The reliability and
lifetimes of both the CMOS and memristor partitions of these architectures are severely
affected by temperature variations across the chip. Therefore, it is expected that dynamic
thermal management (DTM) mechanisms will be needed to improve their reliability and
lifetime.
This thesis explores one aspect of DTM–thermal profiling–in a CMOS/memristor memory
architecture. A temperature sensing resistive random access memory (TSRRAM) was
designed. Temperature information is extracted from the TSRRAM by measuring the write
time of thin-film memristors. Active and passive sensing mechanisms are also introduced
as means for DTM algorithms to determine the thermal profile of the chip. Crosstherm,
a simulation framework, was developed to analyze the effects of temperature variations in
CMOS/memristor architectures.
The TSRRAM design was simulated using the Crosstherm framework for four CMOS
processor benchmarks. Passive sensing produced a mean absolute sensor error of 2.14 K
across all benchmarks. The size of the DTM unit’s memory was also shown to have a
significant impact on the accuracy of thermal data extracted during passive sensing. Active
sensing was also demonstrated, showing the effect of dynamically adjusting sensor
resolution on the accuracy of hotspot temperature estimates.
Contents
Dedication
Acknowledgments
Abstract
1 Motivation and Supporting Work
1.1 Thermal Management in CMOS Integrated Circuits
1.1.1 Impact of Process Scaling
1.1.2 Dynamic Thermal Management
1.2 Emerging CMOS/Memristor Hybrid Architectures
1.3 Summary and Contributions
2 Memristance and Memristive Systems
2.1 Overview of Memristance and Memristive Systems
2.2 Thin-film Memristors
2.3 Thin-film Memristor Models
2.3.1 Linear Model
2.3.2 Exponential Model
2.4 Thin-film Memristor Temperature Dependence
2.5 Summary
3 Resistive Random Access Memory
3.1 Single Memristor as 2-Level RRAM Element
3.1.1 Read Operation
3.1.2 Write Operation
3.2 2-level RRAM Architectures
3.3 Current Sneak Paths
3.4 Multi-level RRAM Architectures
3.5 Summary
4 Temperature Sensing Resistive Random Access Memory Design
4.1 Temperature Sensing RRAM Block
4.1.1 Idle and Read Operation
4.1.2 Write Operation
4.1.3 Temperature Sensing
4.2 Temperature Sensing RRAM Group
4.3 Temperature Sensing RRAM
4.4 TSRRAM Design Space Exploration
4.5 Calibration
4.6 Active and Passive Sensing
4.7 Summary
5 Simulation Setup
5.1 Crosstherm Simulation Framework
5.2 Design Parameters
5.3 Assumptions
5.4 Summary
6 Results and Discussion
6.1 Calibration Run
6.2 Passive Sensing Results
6.2.1 Sensor Accuracy
6.2.2 Analysis of DTM Memory Size
6.3 Active Sensing Results
6.4 Performance Overheads
6.5 Summary
7 Conclusions and Future Work
7.1 Conclusions
7.2 Future Work
A Derivations
A.1 Linear Ionic Drift Model
Bibliography
List of Figures
1.1 ITRS Power Projections
1.2 ITRS Supply Voltages and Gate Lengths
1.3 Intel Core Duo Thermal Profile
1.4 ITRS Equivalent Scaling
1.5 CMOS/Nano Hybrid Architecture
2.1 Fundamental Circuit Quantities
2.2 Thin-film Memristor in a Nanowire Crossbar Configuration
2.3 Thin-film Memristor I-V Curves
2.4 Thin-film Memristor Diffusion Temperature Dependence
2.5 Memristor Write Time vs. Temperature
3.1 Memristor as 2-Level RRAM Element
3.2 Memristor State Variable Transformation
3.3 General RRAM Architecture
3.4 Crossbar Sneak Paths
3.5 Thin-film memristor as a multi-level RRAM element
4.1 Proposed dual-purpose temperature sensing/memory architecture for thermal profiling in CMOS/memristor hybrid architectures.
4.2 M × N TSRRAM Block
4.3 TSRRAM Block Timer Finite State Machine
4.4 B-bit TSRRAM Group
4.5 TSRRAM Architecture
4.6 Top-level Architecture
4.7 TSRRAM Controller Finite State Machine
4.8 Temperature Register and Finite State
4.9 TSRRAM Design Space
4.10 Improved TSRRAM Block
4.11 Improved TSRRAM Block Timer Finite State Machine
4.12 Improved TSRRAM Group and TSRRAM Architectures
4.13 Improved Top-level Architecture
4.14 Improved TSRRAM Controller Finite State Machine
4.15 Improved Temperature Register and Finite State Machine
4.16 Memristor Write Speed Variation
5.1 High-level Block Diagram of the Crosstherm Simulation Framework
5.2 Choosing Rref w0 and Rref w1.
5.3 CMOS and Crossbar Layer Floorplans
6.1 Calibration Results
6.2 Number of Passive Temperature Measurements for Each Thermal Cycle
6.3 Passive Sensing Measurements for GCC Benchmark over 1 Thermal Cycle
6.4 GCC benchmark (a) error and (b) absolute error.
6.5 Time Quantization Error
6.6 GCC Error vs. Temperature and Maximum Error vs. Temperature
6.7 Illustration of Coverage Metric
6.8 Coverage and Mean Hotspot Errors for Varying DTM Memory Sizes
6.9 Finding Global Hotspot in GCC Benchmark–Low Sensor Resolution
6.10 Finding Global Hotspot in GCC Benchmark–Higher Sensor Resolution
6.11 Effect of Active Temperature Resolution
List of Tables
2.1 Switching Processes in Thin-Film Memristors
5.1 Design Parameters
6.1 SPEC2000 CPU Benchmarks
6.2 Passive Sensing Errors
Chapter 1
Motivation and Supporting Work
The limitations and adverse effects of CMOS process scaling have motivated a new class of
computational architectures called CMOS/nano hybrid architectures. CMOS/nano hybrid
architectures augment traditional CMOS architectures with novel materials, devices, and
circuits, allowing IC scaling beyond the end of the CMOS roadmap. CMOS/memristor hybrid
architectures–a subclass of CMOS/nano hybrid architectures–integrate thin-film memristive
crossbar circuits with CMOS devices, offering the possibility of extremely high-density
reconfigurable systems [1]. Over the past few years, research on these architectures has
grown rapidly, yielding several practical demonstrations of hybrid logic and memory
designs [2–11].
For CMOS/memristor architectures to become commercially viable, several of
their reliability concerns must be addressed. Because of their close proximity, the memristor
partitions of these architectures will be affected by thermal gradients in the CMOS partitions.
Because memristor crossbar circuits have small feature sizes, temperature variations will
significantly affect their performance and reliability. Therefore, these architectures will
require dynamic thermal management (DTM) schemes to maximize device lifetimes and
mitigate reliability concerns. DTM in traditional CMOS architectures has become a
well-established research domain. Currently, however, no work exists on thermal management
in next-generation CMOS/memristor hybrid architectures. This thesis explores
one aspect of thermal management–thermal profiling–in a CMOS/memristor hybrid memory
architecture.
The rest of this chapter is organized as follows: Section 1.1 gives an introduction to
thermal management in CMOS ICs. Section 1.2 provides an overview of CMOS/memristor
hybrid architectures. Finally, Section 1.3 summarizes the motivation and lists the
contributions of this thesis work.
1.1 Thermal Management in CMOS Integrated Circuits
1.1.1 Impact of Process Scaling on Power Density and Thermal Gradients
CMOS devices are scaled by changing several critical device parameters, such as physical
dimensions, supply voltages, and clock frequencies, at each technology node. The scaling
factors are typically chosen in order to maintain a constant electric field within the
MOSFET’s channel. For decades, this mechanism has resulted in constant improvements
in the integration level, cost, speed, compactness, power consumption, and functionality of
CMOS devices [12]. At recent technology nodes, however, feature size reduction yields
very high power consumption, and equivalent scaling techniques such as multi-core
processing are becoming standard.
The power consumption in CMOS devices can be expressed as [13–16]:
$$P_{avg} = P_{dynamic} + P_{static}. \qquad (1.1)$$
$P_{dynamic}$ results from charging and discharging capacitive nodes and from short-circuit
current during switching. $P_{static}$ is the static power consumption, which results from
various leakage currents as well as contention in ratioed logic [13]. Figure 1.1 shows
projections from the International Technology Roadmap for Semiconductors (ITRS) [12] for
static and dynamic power in consumer portable and stationary devices. The increasing dynamic power












































Figure 1.1: Power projections for (a) consumer portable and (b) consumer stationary
systems-on-chip (SOCs) by the International Technology Roadmap for Semiconductors
(ITRS) [12].
















































Figure 1.2: Supply voltages and gate lengths vs. year [12].
the number of processing elements in consumer stationary devices (such as desktop computers)
is expected to grow exponentially through 2024 [12]. The associated complexity
of these devices gives rise to very large RC networks for global routing and clock distribution.
Increased static power is the result of leakage currents caused by small-dimension
effects such as drain-induced barrier lowering (DIBL), punchthrough, and gate oxide tunneling
(gate leakage). The growth of gate leakage has slowed with the introduction of high-κ
dielectric materials into the gate stack. Subthreshold leakage is now the largest source of

















where $\mu_0$ is the low-field mobility, $C_{ox}$ is the oxide capacitance, $W$ and $L$ are the
MOSFET channel dimensions, $m$ is the subthreshold slope factor, $V_g$ is the gate voltage,
$V_{th}$ is the threshold voltage, $V_T$ is the thermal voltage, and $V_{DS}$ is the drain-to-source
voltage. Threshold voltage must scale proportionally to supply voltage in order to maintain
drive capabilities. However, the exponential relationship between $I_{sub}$ and $V_{th}$ dictates that
threshold voltage and, as a result, supply voltage scale at a decelerating rate. Higher supply
voltages also yield better noise margins and device switching frequencies [18]. Figure 1.2
shows the decay of supply voltage and gate length. The decay rate of the gate length is
approximately three times that of the supply voltage. This creates very strong electric
fields in MOSFET channel regions, saturating carrier velocities and increasing the interaction
of carriers with ions and interface layers. This increased interaction, or mobility degradation,
increases the effective channel resistance in MOSFET devices, further increasing dynamic
power consumption.
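For reference, a common textbook form of the subthreshold current, written with the symbols defined above, is sketched below. That (1.2) takes exactly this form is an assumption rather than something stated in the surrounding text.

```latex
% Assumed textbook form of the subthreshold current; not confirmed to be the thesis's Eq. (1.2).
I_{sub} = \mu_0 C_{ox} \frac{W}{L} (m - 1) V_T^{2}
          \exp\!\left(\frac{V_g - V_{th}}{m V_T}\right)
          \left(1 - \exp\!\left(-\frac{V_{DS}}{V_T}\right)\right),
\qquad V_T = \frac{kT}{q}
```

In this form, lowering $V_{th}$ by a fixed amount multiplies $I_{sub}$ by a constant factor, which is the exponential relationship referred to above. The thermal voltage $V_T$ also grows linearly with absolute temperature, so the leakage current depends on temperature as well.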
Reduced feature sizes, increased integration levels, non-ideal power supply scaling,
and the overall increase in power consumption in integrated circuits lead to increased
power density (W/cm²). Much of the energy lost in, for example, carrier collisions is
dissipated as heat. Therefore, high power densities give rise to large amounts of
heat generation. This fact, combined with the areal heterogeneity of power consumption
in commercial ICs and the relatively slow lateral diffusion of heat in silicon [19],
yields localized hot spots, or local temperature maxima. For example, Figure 1.3 shows the
thermal profile for an Intel Core Duo processor measured using on-chip thermal diodes.
Figure 1.3: Thermal profile of an Intel Core Duo processor [20].
Several failure mechanisms such as electromigration and stress migration in interconnects,
time-dependent dielectric breakdown (TDDB), and thermal cycling accelerate with
temperature [18]. Migration and TDDB have an exponential temperature dependence [18].
From (1.2) it can be seen that unmanaged temperatures can create a temperature/leakage
power feedback loop, yielding thermal runaway. High temperatures can also create timing
errors and clock skew [21], and affect carrier mobility and threshold voltage in MOSFETs
[17]. Therefore, it is very important to monitor and manage on-chip temperatures in order
to maximize device lifetimes and ensure computational correctness. Temperature can also
be used as an observable test output for identifying defective IC components [22, 23]. In
a many-core platform, chip hotspots are workload-dependent. In order to maximize performance
and reliability in these devices, tasks should be scheduled in a thermally-aware
manner such that hotspot temperatures do not exceed some threshold value and thermal
gradients are minimized. In general, the goal of thermal management is to maximize
device performance while minimizing temperature gradients. The next section discusses
several methods that have been proposed to sense and manage on-chip temperatures in
CMOS devices.
1.1.2 Dynamic Thermal Management
Reducing temperatures and thermal gradients can be achieved through thermal-aware design
or dynamic thermal management (DTM). In thermal-aware design, materials, physical
structures, and floorplans are chosen such that thermal gradients are minimized. For example,
the authors of [24] propose a grid structure to evenly distribute heat across the IC via
lateral diffusion. Another example of thermal-aware design is the placement of L2 cache
between cores in a multi-core system to thermally insulate them from each other [19]. This
thesis, however, will focus on dynamic thermal management.
Dynamic thermal management in ICs can roughly be split into two domains: triggering
and response mechanisms [25]. The goal of a triggering mechanism is to measure
or estimate on-chip temperatures and trigger a hardware- or software-level response that
is a function of those temperatures. Temperature measurements are achieved with analog
or digital on-chip temperature sensors. On-chip temperatures can also be indirectly
estimated through static compile-time code profiling or high-level dynamic performance
analysis [25]. Purely indirect estimations lack any real temperature feedback and can only
yield relative temperature information. Therefore, they are not appropriate for applications
where absolute temperature estimations are required. This thesis will focus on temperature
measurement triggering methods.
The goal of a response mechanism is to maximize device reliability while minimizing
performance degradation. In this domain, certain actions may need to take place in order
to reduce hot spot temperatures or minimize thermal gradients. The main tradeoff in this
domain is system reliability vs. performance. The rest of this section will give an overview
of some prior work that addresses each of these domains.
DTM Triggering with Temperature Measurement
Direct temperature measurement exploits the temperature dependence of physical device
or material parameters such as junction voltages, threshold voltages, resistivity, reflection
coefficients, and transmission coefficients. Semiconductor diodes have been widely
used as on-chip temperature sensors [20, 26]. The diode’s junction voltage is
temperature-dependent and can be measured using an analog-to-digital converter. Noise removal
and other post-processing may also be required to improve the measurement accuracy
[20]. In [23], a temperature sensor based on an operational transconductance amplifier
is proposed. Other types of sensors based on optics and physical contact have been proposed,
but are not easily integrated into packaged ICs [27]. Indirect temperature measurements
with, e.g., digital temperature sensors exploit similar device physics, but typically
use digital counters to measure secondary temperature effects such as delay.
On-chip temperature sensors have non-zero area overhead, which limits the number of
sensors that can be used. Due to design complexity, the locations where sensors can be
placed are also restricted. As a result, there have been several research efforts to optimize
the location of on-chip temperature sensors and to reconstruct temperature profiles from a
discrete and often spatially non-uniform set of temperature data. In [28], a static sensor
placement mechanism is proposed based on k-means clustering, which requires static
prediction of hotspot locations. Long et al. [19] showed that static sensor placement based
on clustering methods may not be appropriate for chip multiprocessors (CMPs) because
of thermal interaction between cores. Furthermore, the dependence of hotspots on time,
workload, and process variation makes predicting hotspot locations infeasible for many-core
systems. Instead, the authors propose a grid-based uniform sensor placement scheme that
utilizes interpolation methods to minimize the number of on-chip sensors and increase
measurement accuracy. Temperature reconstruction from limited sensor data based on spatial
Fourier analysis has also been studied [29]. Due to design and computation overheads, as
well as the need for static hotspot predictions, most of the above profiling methods are not
scalable to 1000-core systems.
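As an illustration of how a uniform sensor grid can be combined with interpolation, the sketch below estimates the temperature at an arbitrary point from the four surrounding sensors. Bilinear interpolation is used here only as one simple choice; the interpolation methods in the cited work may differ, and the function name, grid layout, and pitch parameter are hypothetical.

```python
def interpolate_temperature(grid, pitch, x, y):
    """Bilinearly interpolate a temperature (K) at point (x, y).

    Assumption: grid[r][c] holds the reading of the sensor located at
    (c * pitch, r * pitch), and the grid is at least 2 x 2.
    """
    rows, cols = len(grid), len(grid[0])
    # Clamp the query point to the area covered by the sensors.
    x = min(max(x, 0.0), (cols - 1) * pitch)
    y = min(max(y, 0.0), (rows - 1) * pitch)
    # Indices of the first sensor of the enclosing grid cell.
    c0 = min(int(x // pitch), cols - 2)
    r0 = min(int(y // pitch), rows - 2)
    # Fractional position of the point inside that cell (0.0 to 1.0).
    fx = x / pitch - c0
    fy = y / pitch - r0
    # Interpolate along x on both rows of the cell, then along y.
    row_a = grid[r0][c0] * (1 - fx) + grid[r0][c0 + 1] * fx
    row_b = grid[r0 + 1][c0] * (1 - fx) + grid[r0 + 1][c0 + 1] * fx
    return row_a * (1 - fy) + row_b * fy


# Hypothetical 2 x 2 sensor grid with a 1.0 mm pitch.
readings = [[300.0, 310.0],
            [320.0, 330.0]]
center = interpolate_temperature(readings, 1.0, 0.5, 0.5)
```

A scheme like this recovers only smooth variation between sensors, so a hotspot narrower than the sensor pitch can still be underestimated.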
DTM Responses
DTM response mechanisms are triggered by the thermal profiles discussed in the last section.
As stated above, the goal of the DTM response is to maximize the reliability of the
device (by reducing temperatures or thermal gradients) while imposing minimal performance
degradation. Responses can be categorized into thresholded and non-thresholded
mechanisms. In a thresholded mechanism, a response will only take place once a given
threshold thermal profile has been exceeded. Non-thresholded mechanisms, on the other
hand, continuously respond to new thermal profiles. In this case, the response is adapted to
the severity or characteristics of the thermal profile.
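The difference between the two categories can be sketched as follows. This is a minimal illustration that reduces the thermal profile to a single hotspot temperature; the function names, temperature limits, and the linear scaling rule are assumptions chosen for the example and are not taken from any cited scheme.

```python
def thresholded_response(hotspot_temp_k, threshold_k=358.0):
    """Thresholded: respond (e.g. throttle) only above a fixed limit.

    Returns True when the response should be engaged.
    """
    return hotspot_temp_k > threshold_k


def non_thresholded_response(hotspot_temp_k, t_low_k=318.0, t_high_k=358.0):
    """Non-thresholded: adapt the response to the severity of the profile.

    Returns a performance scale factor that falls linearly from 1.0 at
    t_low_k to 0.5 at t_high_k (hypothetical limits and slope).
    """
    severity = (hotspot_temp_k - t_low_k) / (t_high_k - t_low_k)
    severity = min(max(severity, 0.0), 1.0)  # clamp to [0, 1]
    return 1.0 - 0.5 * severity


engaged = thresholded_response(360.0)        # above the limit
scale = non_thresholded_response(338.0)      # halfway between the limits
```

The thresholded form is simpler to implement but acts only after the limit is crossed, while the graded form trades a small, continuous performance cost for earlier intervention.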
Hardware-level DTM responses scale the system performance in order to reduce power
consumption and temperatures. For example, dynamic voltage/frequency scaling (DVFS)
exploits the quadratic dependence of dynamic power consumption on supply voltage and
its linear dependence on clock frequency [13]. In 2000, Burd et al. showed that DVFS can yield
large power savings in microprocessors [30]. In [31], the authors show that the steady-state
thermal responses to DVFS voltage/frequency pairs are proportional to the corresponding
power levels, making DVFS an effective DTM response. Other hardware-level DTM responses
scale performance at the microarchitectural level. Fetch and decode throttling,
speculation control, and instruction cache toggling have also been studied as thermal
management responses [25]. Coarse-grained DVFS and microarchitectural DTM responses
incur high performance costs. System- or OS-level DTM responses such as thermal-aware
task scheduling can take advantage of finer-grained temperature data to yield responses with
smaller performance penalties [32]. Therefore, a dense grid of sensors combined with
fine-grained response mechanisms would be ideal for minimizing the performance costs of
dynamic thermal management.
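The power saving available from DVFS can be illustrated with the usual switching-power approximation, in which dynamic power is proportional to the activity factor, the switched capacitance, the square of the supply voltage, and the clock frequency. The sketch below covers only this switching component (short-circuit power is ignored), and all numeric values are hypothetical.

```python
def dynamic_power(alpha, capacitance_f, vdd_v, freq_hz):
    """Switching power in watts: alpha * C * Vdd^2 * f.

    alpha is the activity factor, capacitance_f the switched capacitance
    in farads. Short-circuit power is not modeled in this sketch.
    """
    return alpha * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical operating points: nominal, and a point with supply
# voltage and clock frequency both scaled to 80% of nominal.
nominal = dynamic_power(alpha=0.1, capacitance_f=1e-9, vdd_v=1.0, freq_hz=2.0e9)
scaled = dynamic_power(alpha=0.1, capacitance_f=1e-9, vdd_v=0.8, freq_hz=1.6e9)

# Quadratic in voltage and linear in frequency: 0.8**2 * 0.8 = 0.512.
ratio = scaled / nominal
```

Under these assumptions, a 20% reduction in both voltage and frequency cuts switching power roughly in half, which is why DVFS is attractive as a thermal response despite its performance cost.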
1.2 Emerging CMOS/Memristor Hybrid Architectures
Due to the scaling limitations of CMOS devices, several research efforts are underway
to identify new materials, fabrication processes, devices, architectures, and computing
paradigms that will enable scaling beyond the end of the CMOS roadmap. The timeline
for those efforts can approximately be divided into three periods: equivalently scaled
CMOS, CMOS/nano hybrid, and post-CMOS. In the equivalently scaled CMOS period,
new materials and FET structures will allow for further dimensional scaling of CMOS
devices. Figure 1.4(a) shows ITRS projections for equivalent scaling. High-κ dielectric
materials are integrated into the gate stack to improve drain current while reducing gate
tunneling current. Multiple-gate FET structures and gate-all-around (GAA) structures can
be used to improve electrostatic control of the FET channel. New materials are also being
explored to improve carrier mobility in the MOSFET channels [12]. In the post-CMOS
period, unique properties of nanoelectronic devices will be leveraged to create completely
novel architectures, computation paradigms, and data representations. Figure 1.4(b) shows
possible post-CMOS state variables, materials, devices, data representations, and architectures.
Most of these post-CMOS technologies, however, are at least a decade away from
being commercially viable CMOS replacements.
Molecular electronics devices and architectures have become a widely studied post-CMOS
technology [33–44]. Since Aviram and Ratner’s seminal paper on molecular rectifiers
[45], two-terminal molecular devices have been identified which exhibit rectification
[43], electrically-configurable bistability [37], and negative differential resistance (NDR)
[33]. The integration of these devices to form logic circuits has also been described in
[35, 38]. Several proposed fabrication processes for these devices are based on chemical
self-assembly, where devices, circuits, and architectures are built in a bottom-up process.
In order to overcome the alignment issues and high defect rates associated with bottom-up
processes, the crossbar architecture was proposed, where two perpendicular nanowire
planes sandwich a switching layer.
The crossbar architecture enables extremely high density, connectivity, and addressability
for circuits based on two-terminal nanodevices. Assuming a wire pitch $P$, a crossbar
circuit will have a density of $1/P^2$. Assuming that $P$ is equal to twice the minimum
feature size $F$, the density becomes $1/(4F^2)$ [46]. In an $M \times N$ crossbar circuit, each
nanodevice is directly connected to $M + N - 2$ neighboring nanodevices, and may be
(depending on the type of nanodevice) indirectly connected to even more nanodevices. This
high connectivity allows all $MN$ nanodevices in the crossbar circuit to be addressed using
only $M + N$ interface connections.
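The density and connectivity expressions above can be checked numerically with a short sketch. The function names and the 10 nm feature size are hypothetical and serve only to make the arithmetic concrete.

```python
def crossbar_density(feature_size_m):
    """Devices per square meter, assuming wire pitch P = 2F."""
    pitch = 2.0 * feature_size_m
    return 1.0 / pitch ** 2


def crossbar_stats(m, n):
    """Counts for an M x N crossbar, following the expressions in the text."""
    return {
        "devices": m * n,                 # one nanodevice per crosspoint
        "interface_connections": m + n,   # one per nanowire
        "direct_neighbors": m + n - 2,    # devices sharing a row or column wire
    }


# Hypothetical example: F = 10 nm gives P = 20 nm.
density_per_m2 = crossbar_density(10e-9)
density_per_cm2 = density_per_m2 / 1e4   # 1 m^2 = 1e4 cm^2

stats = crossbar_stats(1024, 1024)
```

For the 1024 × 1024 case, just over a million devices are reached through 2048 interface connections, which shows how addressing cost grows with the perimeter of the array while capacity grows with its area.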
Diode-resistor logic and FET-based programmable logic arrays based on the crossbar
























Figure 1.4: ITRS (a) equivalent scaling projections and (b) new information processing
technologies [12].
architecture have been proposed [47, 48]. However, due to the difficulty of implementing
complex operations such as latching and inversion in the crossbar architecture, most
crossbar-based logic implementations require supporting CMOS circuitry. CMOS circuits
are also required for several proposed defect mapping and built-in self-test algorithms for
nanowire crossbar-based circuits [49–52]. Several architectures based on the combination
of CMOS devices with nanoscale crossbar circuits have been proposed [2, 3, 53–55]. For
the rest of this document, these architectures will be referred to as CMOS/nano hybrid
architectures. High-level design choices for CMOS/nano architectures include the physical
placement of the nano structures in relation to CMOS structures, as well as the interface
between the two. In [56], the authors suggest the fabrication of nanowire crossbars on top
of CMOS layers as a back-end CMOS process, as in Figure 1.5. In the CMOL [2] and
FPNI [3] architectures, interfacing between crossbars and CMOS layers is achieved with
area-distributed vias.
Figure 1.5: CMOS/nano hybrid architecture.
Recently, the integration of thin-film memristors with CMOS circuits [1] has been proposed
(CMOS/memristor hybrid), especially for high-density memory [4–11]. Several
groups have shown memristor-based memory to have high density, fast write speeds, and
low volatility, making it an ideal replacement for memory at every level of the hierarchy.
CMOS/nano hybrid architectures are expected to suffer from the same thermal management
issues as pure CMOS architectures. CMOS layers will be affected in ways similar to those
described in the last section. Nano layers will be affected by temperatures and thermal
gradients emanating from CMOS layers, as well as those from their own power consumption.
High temperatures will increase the resistivity of nanowire crossbars and will affect
the switching characteristics of two-terminal nano devices.
1.3 Summary and Contributions
The integration of CMOS/memristor hybrid RRAM with multicore CMOS processors may
be commercially realizable within three years. However, the large thermal gradients that
exist in ultra-scaled CMOS devices will adversely affect the reliability and lifetime of these
architectures. The specific objectives of this thesis are to
• Design a CMOS/memristor hybrid RRAM architecture capable of temperature sensing
• Develop a simulation framework (Crosstherm Framework) to analyze the effects of
temperature variation in CMOS/memristor hybrid architectures
• Test the thermal profiling capabilities of the RRAM architecture using the developed
framework
The rest of this document is organized as follows: Chapter 2 gives a brief overview of
memristance and memristive systems, as well as the temperature dependence of thin-film
memristors. Chapter 3 discusses the application of thin-film memristors to CMOS/memristor
hybrid RRAM architectures. In Chapter 4, a CMOS/memristor RRAM design is modified
to add temperature sensing capabilities. The temperature sensing RRAM (TSRRAM) is
capable of profiling the thermal characteristics across the entire chip. Chapter 5 discusses the
simulation framework developed to test the TSRRAM architecture and outlines the chosen
simulation parameters. Simulation results and analyses are presented in Chapter 6, and
Chapter 7 discusses conclusions and future work.
Chapter 2
Memristance and Memristive Systems
The memristor was postulated in 1971 by Chua as the fourth fundamental circuit ele-
ment [57]. Chua argued that there must exist a passive device that relates charge and flux,
where the proportionality factor is called memristance. He later generalized the concept
of memristance to a class of non-linear dynamical systems called memristive systems. He
showed that several dynamical systems, including those involving ionic transport, could be
modeled as memristive systems [58]. It wasn’t until 2008, however, that a passive device
was proposed as a physical realization of the memristor [59]. Since then, memristors have
been studied for a wide range of applications, including reconfigurable logic [60], feedback
systems [61], neuromorphic systems [62, 63], and non-volatile memory [4–11].
This chapter provides a brief overview of memristance and memristive systems. Section
2.1 gives an introduction to the concept of memristance. Sections 2.2–2.4 discuss thin-film
memristors, models, and their temperature dependence. Finally, Section 2.5 summarizes
this chapter.
2.1 Overview of Memristance and Memristive Systems
The concept of the memristor (memory resistor) was introduced by Leon Chua via
fundamental symmetry arguments. Chua argued that there are six possible relationships
between the four fundamental circuit variables–voltage (v), current (i), charge (q), and flux
(φ). These relationships are shown in Figure 2.1 [57], where R is resistance, L is induc-
tance, C is capacitance, and M is memristance. In the case of linear circuit components,
Figure 2.1: Four fundamental circuit quantities and their relationships with each other.
R, L, C, and M are proportionality constants and M is analogous to R. In the case of
non-linear circuit components, however, R, L, C, and M are functions of their defining
independent variables: R = α(i), L = β(i), C = σ(v), and M = γ(q).
In [58], Chua and Kang generalized memristors into a class of non-linear dynamical
systems called memristive systems, described by the following equations:
\[
\frac{dx}{dt} = f(x, u, t), \qquad y = g(x, u, t)\, u \tag{2.1}
\]
In (2.1), u is the system input, y is the system output, and x is the system state variable.
The distinction between memristive systems and other dynamical systems is the fact that
the output y is always zero when the input u is zero, resulting in zero-crossing Lissajous
y − u curves. Thermistors, ionic systems, and discharge tubes have all been identified as
memristive systems [58].
In the case of electrical memristive systems, the memristive system input and output can
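be voltage or current, and the g function can be a generalized resistance or conductance.
For a generalized resistance, (2.1) becomes
\[
\frac{dx}{dt} = f(x, v), \qquad v = R(x)\, i \tag{2.2}
\]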
Table 2.1: Switching processes in thin-film memristors.
Switching Process                                        References
Ionic Diffusion
  O2 vacancy drift                                       [59]
  Metal filament formation/retraction                    [4, 6, 7, 66, 67]
  Schottky barrier breakdown at oxide/metal interface    [68]
Mechanical switching in Rotaxanes                        [39]
where v is the voltage across the memristor, i is the current through the memristor, x is the
state variable, R is a generalized resistance dependent on x, and f describes how the state
variable changes as a function of the voltage as well as the current state [58].
2.2 Thin-film Memristors
Hysteretic resistance switching in thin films has been observed for over 40 years [64, 65].
In 2008, Strukov et al. showed that the resistance switching behavior in vacancy-doped
titanium dioxide thin films could be described by an electrical memristive system [59].
Thin film materials that exhibit resistance switching in an applied electric field have become
known as thin-film memristors. The exact physical processes that cause the switching
phenomena depend on the materials and fabrication processes.
In [59], the switching process is described by drift of oxygen vacancies within the ac-
tive layer. In [4, 6, 66, 67], conductive channels were formed through amorphous silicon by
the diffusion of metal ions from the electrodes. Similarly, in [7], gold ions form a switch-
able conduction channel through a manganese-doped zinc layer. In [68], the resistance
switching in metal/oxide/metal layers is described by the breakdown of a Schottky bar-
rier at the oxide/metal interface. In [69], memristive switching is explained by tunneling
barrier modulation. The underlying mechanism of each of these is ionic diffusion. That
is, the distribution of ions within the thin film changes in order to change the memristor’s
Figure 2.2: Thin-film memristor in a nanowire crossbar configuration. Red dots represent
mobile ions, which can drift or diffuse, moving an effective domain wall w. The ion
distribution defines the resistivity profile along the memristor’s active region.
resistance. Memristive behavior in thin films based on spin blockade has also been studied
[70]. The resistance switching behavior in monomolecular films such as Rotaxanes [39],
can also be described in terms of memristive systems. This thesis, however, will focus only
on thin-film memristors based on ionic diffusion because of the abundance of experimental
demonstrations and models.
2.3 Thin-film Memristor Models
The form and complexity of thin-film memristor models are governed by the materials,
fabrication methods, and physical processes involved in the memristor’s switching. This section
discusses two thin-film memristor models that have been used widely in the literature to de-
scribe several types of thin-film memristive devices.
2.3.1 Linear Model
The linear memristor model was first proposed by Strukov et al. in [59] to describe the
memristive resistance switching observed in vacancy-doped TiO2. The linear model is
easily described with the aid of the thin-film memristor in Figure 2.2. The memristor active
region is contacted at each end by perpendicular wires (crossbar configuration). D is the
thickness of the film, and L and W define the cross-sectional area. Red dots represent mobile
ions which can drift or diffuse in the presence of an electric field or concentration gradient.
An arbitrary ion concentration $N_{Iw}$ can be defined as the concentration threshold between a
doped and undoped region of the memristor’s active region. In Figure 2.2, this $N_{Iw}$ would
correspond to a domain wall location w. The ion distribution also defines the resistivity
profile shown to the right of the memristor. The domain wall location w defines the barrier
between high resistivity and low resistivity regions of the memristor. In the linear model,
the state and terminal equations are
\[
\frac{dx}{dt} = \frac{\mu_I R_{on}}{D^2}\, i_m(t), \qquad
v_m(t) = \left[ R_{on} x + R_{off}(1 - x) \right] i_m(t) \tag{2.3}
\]
where x = w/D is the state variable, µI is the ion mobility, Roff is the memristor resistance
when x = 0, Ron is the memristor resistance when x = 1, vm(t) is the terminal voltage,
and im(t) is the current through the memristor.¹ Solving (2.3) for im(t) with the initial
condition x(t = 0) = x0 and appropriate boundary conditions yields
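\[
i_m(t) = W(\phi)\, v_m(t), \tag{2.4}
\]
where the memductance $W(\phi)$ is
\[
W(\phi) = \frac{1}{R_{on}\sqrt{r^2 + 2(r - 1)\left(B - \mu_I \phi(t)/D^2\right)}} \tag{2.5}
\]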
In (2.5), $r = R_{off}/R_{on}$ and $B = \frac{r - 1}{2} x_0^2 - r x_0$. As D approaches infinity, the
memductance reduces to the initial conductance, $(R_{on} x_0 + R_{off}(1 - x_0))^{-1}$. However, as
D becomes small, the memductance becomes quadratically more dependent on the flux.
As a result, both memductance and memristance are important phenomena in small-feature
¹ See Appendix A.1 for a complete derivation of the linear ionic drift model.
Figure 2.3: Thin-film memristor I-V curves. Parameters: $\mu_I = 1 \times 10^{-13}$ m$^2$/Vs, $R_{on} = 100\ \Omega$,
$r = 160$, $D = 10 \times 10^{-9}$ m, $x_0 = 0.1$, $w_0 = 10\pi$ [59].
devices, especially thin films.
Figure 2.3 shows simulation runs of a thin-film memristor with $\mu_I = 1 \times 10^{-13}$ m$^2$/Vs,
$R_{on} = 100\ \Omega$, $r = 160$, $D = 10 \times 10^{-9}$ m, and $x_0 = 0.1$. The input voltage is a sine wave
with an amplitude of 1 V and base frequency w0. Simulation runs for different voltage
source frequencies are shown. Curves corresponding to higher input frequencies show
less hysteresis than those corresponding to lower frequencies. Although the above model
gives valuable insight into the thin-film memristor’s operation, it fails to include several
important phenomena such as nonlinear ionic drift velocity, temperature effects, and ion
diffusion.
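The frequency dependence of the hysteresis can be reproduced with a short numerical integration. The following is a minimal sketch, not the simulation code used to generate Figure 2.3. It assumes the linear drift state equation dx/dt = (µI Ron/D²) im(t) from [59], a forward-Euler integrator, hard clipping of x to [0, 1], and that w0 = 10π is an angular frequency in rad/s. The function name and step count are illustrative choices.

```python
import math

def simulate_linear_memristor(freq_mult, steps=20000, mu=1e-13, r_on=100.0,
                              r=160.0, d=10e-9, x0=0.1, v_amp=1.0,
                              w0=10 * math.pi):
    """Integrate the linear drift model over one period of a sine input.

    Returns lists of voltage, current, and state samples.
    """
    r_off = r * r_on
    omega = freq_mult * w0                 # assumed to be in rad/s
    dt = (2 * math.pi / omega) / steps
    k = mu * r_on / d ** 2                 # assumed state-equation gain, 1/(A*s)
    x = x0
    vs, cs, xs = [], [], []
    for n in range(steps + 1):
        v = v_amp * math.sin(omega * n * dt)
        i = v / (r_on * x + r_off * (1.0 - x))    # terminal equation (2.3)
        vs.append(v)
        cs.append(i)
        xs.append(x)
        x = min(1.0, max(0.0, x + k * i * dt))    # Euler step with clipping
    return vs, cs, xs

for mult in (1, 10):
    _, _, states = simulate_linear_memristor(mult)
    print(f"{mult} x w0: state swing = {max(states) - min(states):.3f}")
```

The state swing over one period shrinks as the input frequency rises, because less flux is applied per half cycle. This is why the higher-frequency I-V curves show less hysteresis.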
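The authors of [73] improve the model presented above by adding the effects of ion
self-diffusion, internal electric fields between charged ion species, temperature, and physical
boundaries. The ion concentration $N_I(x, t)$ obeys the continuity equation
\[
\frac{\partial N_I}{\partial t} = -\frac{\partial J_I}{\partial x} \tag{2.6}
\]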
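where $J_I$ is the ion flux, which is given as
\[
J_I = -D_I(T) \frac{\partial N_I}{\partial x} - \mu_I(T)\, N_I \frac{\partial V(x, t)}{\partial x} \tag{2.7}
\]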
In (2.7), $\mu_I(T)$ and $D_I(T)$ are the temperature-dependent mobility and diffusivity, and
$V(x, t)$ is the potential along the memristor’s active region. The diffusivity depends on the
crystal geometry a, jump frequency f, and ion activation energy $E_A$ as [74]
\[
D_I(T) = a^2 f \exp\left( -\frac{E_A}{k_B T} \right) \tag{2.8}
\]
where $k_B$ is Boltzmann’s constant. The mobility and diffusivity are related by the
Nernst-Einstein relation
\[
\mu_I(T) = \frac{q_I D_I(T)}{k_B T} \tag{2.9}
\]
where qI is the ion charge. In the case of high electric fields, the diffusivity becomes
exponentially proportional to the field [76]. The model presented in (2.6)-(2.9) provides
a better physical representation of the memristor’s behavior, but it requires a numerical
solution and, therefore, it is not appropriate to use with circuit simulation tools like SPICE.
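Another approach to modeling non-linear dopant drift at memristor boundaries is to add a
window function F(x) to the linear drift state equation:
\[
\frac{dx}{dt} = \frac{\mu_I R_{on}}{D^2}\, i_m(t)\, F(x) \tag{2.10}
\]
where F(x) = 0 at x = 0 and x = 1. References [59, 72, 77, 78] propose several different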
Figure 2.4: Thin-film memristor diffusion temperature dependence.
forms for F (x). Several groups have proposed SPICE-compatible models based on the
model given in (2.3) [77–80]. Other models based on empirical analyses have also been
proposed [81].
2.3.2 Exponential Model
The linear memristor model assumes that electron transport through the memristor is lin-
early proportional to the memristor state and the memristor voltage. Other models, how-
ever, suggest that an exponential relationship exists between electron current and the ap-
plied memristor voltage [6, 68, 82]. In general, the form of these models is
\[
i_m = \alpha \sinh(\beta v_m) \tag{2.11}
\]
where α and β depend on the memristor state. The exponential relationship between cur-
rent and voltage is typical in electron tunneling through a thin potential barrier [83, 84].
Therefore, α and β are proportional to the tunneling barrier width and height.
Figure 2.5: Memristor write time versus temperature.
2.4 Thin-film Memristor Temperature Dependence
Most of the closed-form models discussed above fail to take into account the effects of
temperature on the switching characteristics of thin-film memristors. Since ion diffusion is a
thermally activated process, thin-film memristors demonstrate a three-fold temperature
dependence. (1) The first dependence is that of the ion diffusivity, given in (2.8). To demonstrate
the diffusivity temperature dependence, the drift-diffusion model in (2.6)-(2.9) was
simulated with EA = 0.9 eV, D = 10 nm, x0 = 0.1, a = 0.15 nm, and f = 10^13 Hz [76]. In all
of the simulations, vm(t) = 0 and the internal electric field is ignored. Several simulation
results are shown in Figure 2.4. Figure 2.4(a) shows the redistribution of dopant ions in the
memristor’s active region over time at 300 K. Figure 2.4(b) shows x versus temperature and
time. For a fixed time, the domain wall location is a monotonically increasing function of
temperature. Figure 2.4(c) shows operation regions in which the thin-film memristor could
be used as a diffusion-based temperature sensor. In the non-thresholded region, the mem-
ristor’s domain wall location will map to a unique temperature. In the thresholded region,
the memristor’s domain wall will reach an extreme before the sensing period has elapsed,
so it will only be able to determine if a threshold temperature was reached. (2) The second
temperature dependence is from the ion mobility, given in (2.9). Figure 2.5 demonstrates
the mobility dependence. The simulation parameters are the same as the ones described
above, except EA = 0.5 eV. The time that it takes to move the domain wall state x from
0 to 1 is plotted against temperature. The exponential dependence suggests that memristor
write time can be used to infer temperature. (3) The last temperature dependence is that of
electron transport through the memristor in a given state, described in [85].
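The first two temperature dependences can be illustrated numerically. The sketch below is not the simulation used for Figures 2.4 and 2.5. It assumes an Arrhenius diffusivity of the form D_I = a² f exp(−EA/(kB T)) with a unit geometric prefactor, the Nernst-Einstein relation µI = qI D_I/(kB T), and a singly charged ion. The function names are hypothetical, and the default parameters are the values quoted in this section.

```python
import math

K_B = 8.617333262e-5   # Boltzmann constant in eV/K

def diffusivity(temp_k, e_a=0.9, a=0.15e-9, f=1e13):
    """Arrhenius diffusivity in m^2/s (unit prefactor a^2 * f is an assumption)."""
    return a ** 2 * f * math.exp(-e_a / (K_B * temp_k))

def mobility(temp_k, e_a=0.5, charge_e=1.0, a=0.15e-9, f=1e13):
    """Nernst-Einstein mobility in m^2/(V*s).

    charge_e is the ion charge in units of the elementary charge
    (singly charged ions are an assumption).
    """
    return charge_e * diffusivity(temp_k, e_a=e_a, a=a, f=f) / (K_B * temp_k)

# In the linear drift model the write time scales as 1/mobility, so the
# mobility ratio gives the write-time speed-up relative to 300 K.
for temp in (300.0, 350.0, 400.0):
    speedup = mobility(temp) / mobility(300.0)
    print(f"T = {temp:.0f} K: write-time speed-up = {speedup:.1f}x")
```

Because the exponential term dominates the 1/T factor, a modest temperature rise shortens the write time considerably, which is what makes write time usable as a temperature proxy.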
2.5 Summary
Memristive devices and systems are capable of encoding their past inputs into a state vari-
able, making them a natural choice as a memory device. In 2008, a practical demonstration
of memristance in vacancy-doped TiO2 created an explosive growth in research on thin-
film memristors and CMOS/memristor hybrid architectures. As will be discussed in the
next two chapters, the temperature dependence and non-volatility of thin-film memristors
offers the possibility of designing dual-purpose resistive random access memory for tem-
perature sensing and data storage.
Chapter 3
Resistive Random Access Memory
Flash memory and static random access memory (SRAM) are approaching fundamental
scaling limits. Charge leakage in flash memory and the effects of subthreshold leakage
and soft errors in SRAM are major reliability concerns. Consequently, novel materials,
architectures, and ways of representing data are being proposed for non-volatile memory.
Several new technologies, including phase-change memory (PCRAM), ferroelectric mem-
ory (FeRAM), magnetic memory (MRAM), molecular memory, carbon nanotube-based
memory, and resistive random access memory (RRAM) are being proposed [86]. RRAM
has been shown to have high endurance, long retention times, high off/on resistance ratios,
and fast programming speeds [7]. Furthermore, since RRAM is not a charge-based mem-
ory, it will be much less susceptible to soft errors. RRAM based on hysteretic resistance
switching in thin-film memristors can be integrated into crossbar circuits for extremely
high memory density. All of these characteristics make thin-film memristor-based RRAM
a strong candidate for replacing memory at all levels of the hierarchy.
The rest of this chapter outlines the operations, architectures, and challenges associ-
ated with RRAM. Section 3.1 discusses how a single thin-film memristor can be used to
store/retrieve a single bit of data. Section 3.2 presents a general 2-level CMOS/Memristor
RRAM architecture and discusses each of its components. Section 3.3 presents the prob-
lem of sneak paths in the context of RRAM, and discusses several proposed solutions.
Multi-level RRAM architectures, where single memristors are used to store multiple bits,
are briefly discussed in Section 3.4. Finally, Section 3.5 summarizes this chapter.
Figure 3.1: Thin-film memristor as a 2-level RRAM element.
3.1 Single Memristor as 2-Level RRAM Element
A single memristive device can be used as a 2-level RRAM element by storing either a
high or low resistance value. This concept is illustrated in Figure 3.1. In the write phase,
a voltage pulse is applied to change the memristor’s resistance to a high or low value. In
the read phase, these resistance values can be transformed into CMOS-compatible voltage
levels to represent a zero or one. The rest of this section describes these operations in detail.
3.1.1 Read Operation
Reading the information stored in a memristive device requires the transformation of its
state variable x into a measurable physical quantity Qm(x). This concept is illustrated
in Figure 3.2. The state of the memristor can be read by measuring the current through
the memristor or by determining its resistance relative to a pull-down resistor. A third
measurable quantity is proposed in [87], where the authors studied memristors based on
ionic transition metal complexes. They showed that the memristor current is proportional
to light emission from radiative exciton recombination in the active region. This offers the
possibility of optical readout of the memristor state.
In the case of thin-film memristors, where x = w/D, the measured quantity Qm will be
a function of the memristor’s resistance Rm. In general, some type of interface circuitry will
Figure 3.2: Transforming the thin-film memristor’s state variable to a measurable physical
quantity.
be required to perform the measurement of Qm and output the corresponding logic level.
In a CMOS/memristor hybrid device, these logic levels will correspond to the CMOS logic
levels. For two-level RRAM, a comparator can be used to compare a reference voltage with
the voltage drop across RPD. The pull-down resistor should be carefully designed in order
to maximize distinguishability between states [8]. Analog-to-digital converters (ADCs) can
also be used to convert the memristor’s state into a digital value.
In order to read the memristor state, a voltage vm must be applied to the memristor,
causing the desired measurable circuit quantity Qm to become non-zero. It is important
that this voltage does not change (destroy) the memristor state. One way to ensure this
is to apply both a positive and negative voltage pulse during the read operation so that
the net applied flux φ(t) is zero. Examining (A.12), if φ(t) is zero, then x(t) is equal to
the initial state x0. However, when the memristor domain wall is in one of the extreme
states (x = 0 or x = 1, depending on the polarity), this scheme will not work [8]. In
general, the domain wall velocity will be asymmetric: It will be faster in one direction
than it is in the other [69, 73]. As a result, the positive or negative voltage pulse will
have to have a longer duration in order to maintain the initial memristor state. Any noise
during the read operation or domain wall motion due to ionic diffusion will cause undesired
changes in the memristor state. In [8], a refresh scheme is proposed to periodically rewrite
the memristor to the correct state. Another approach is to use a safety margin, where
intermediate resistance values are treated as undefined memory values.
3.1.2 Write Operation
In the single 2-level RRAM element write operation, a positive or negative flux needs to
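be applied to change the memristor’s resistance value. Assuming a constant-voltage write
pulse of amplitude $v_w$, the write time $t_w$ is
\[
t_w = \frac{D^2}{\mu_I v_w} \left( \frac{r_1 - 1}{2} \left( x_0^2 - x_f^2 \right) + (r_1 + r_2)(x_f - x_0) \right) \tag{3.1}
\]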
where x0 and xf are the initial and final states, r1 = Roff/Ron, and r2 = RPD/Ron. In the
case of 2-level RRAM, x0 and xf would ideally correspond to either 0 or 1.
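As a sanity check on the write-time expression, the sketch below evaluates the closed form and compares it with a direct numerical integration. It assumes the linear drift model with the memristor in series with the pull-down resistor, so that dt = D² (r1 + r2 − (r1 − 1)x) / (µI vw) dx. The symbol v_w for the write voltage, the function names, and the default µI and D (taken from the Figure 2.3 parameters) are illustrative assumptions.

```python
def write_time(x0, xf, v_w, r1, r2, mu=1e-13, d=10e-9):
    """Closed-form write time in seconds for a constant write voltage v_w."""
    return d ** 2 / (mu * v_w) * (
        (r1 - 1) / 2 * (x0 ** 2 - xf ** 2) + (r1 + r2) * (xf - x0))

def write_time_numeric(x0, xf, v_w, r1, r2, mu=1e-13, d=10e-9, steps=100000):
    """Midpoint-rule integration of dt/dx from x0 to xf (assumed series circuit)."""
    dx = (xf - x0) / steps
    total = 0.0
    for n in range(steps):
        x = x0 + (n + 0.5) * dx
        total += (r1 + r2 - (r1 - 1) * x) * dx
    return d ** 2 / (mu * v_w) * total

# Full write from the high-resistance state (x = 0) to the low-resistance
# state (x = 1); r2 = 10 is a hypothetical pull-down ratio.
print(write_time(0.0, 1.0, 1.0, r1=160.0, r2=10.0))
print(write_time_numeric(0.0, 1.0, 1.0, r1=160.0, r2=10.0))
```

A write in the opposite direction uses a negative v_w with x0 and xf swapped, which again gives a positive time.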
In order to improve the write speed and corresponding write energy of the RRAM
element, the authors of [10] propose using partial memristor programming. In partial pro-
gramming, only a portion of the total memristor resistance range Roff − Ron is used. In
Figure 3.1, this idea corresponds to using the gray regions of the memristor to represent
logic levels low and high. However, partial programming will reduce the distinguishability
between the logic high and logic low Qm values. To mitigate this effect, the authors
propose the use of two memristors to represent a single memory bit [10].
3.2 2-level RRAM Architectures
CMOS/memristor RRAM architectures combine memristor crossbar circuits with addi-
tional CMOS circuitry to yield high-density non-volatile RRAM. Figure 3.3 shows a gen-
eralized CMOS/memristor hybrid RRAM architecture. An M × N crossbar array is used
as the storage medium. As discussed in Section 1.2, crossbar circuits offer extremely high
density and addressability. The thick purple box represents the CMOS/nano interface. All
circuitry outside of the box is implemented in CMOS, and everything inside the
Figure 3.3: General CMOS/memristor RRAM block.
box is implemented using non-CMOS paradigms (e.g., nanowire crossbars).
Row and column selection circuits, such as multiplexers/demultiplexers, along with an
address decoder are used to isolate a single memristor from the crossbar. To reduce the area
overhead of the CMOS/nano interface, the authors of [88] propose a demultiplexer design
that can address n!/[w!(n−w)!] nanowires with only n CMOS wires. A read/write control
circuit applies a read or write voltage depending on the value of r̄w. Two tristate buffers
and an enable signal are used to isolate the RRAM block when it is not being used.
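The addressing expression n!/[w!(n−w)!] is the binomial coefficient C(n, w), so its growth is easy to tabulate. The sketch below is only an illustration of that arithmetic, not the demultiplexer design of [88]. The helper names are hypothetical, and choosing w = n // 2 as the best case is an assumption based on where the binomial coefficient peaks.

```python
import math

def addressable_nanowires(n, w):
    """Number of nanowires addressable with n CMOS wires: n! / (w! (n - w)!)."""
    return math.comb(n, w)

def min_cmos_wires(target):
    """Smallest n whose best case C(n, n // 2) reaches the target nanowire count."""
    n = 1
    while math.comb(n, n // 2) < target:
        n += 1
    return n

for n in (4, 8, 12, 16):
    print(n, addressable_nanowires(n, n // 2))
```

The number of addressable nanowires grows much faster than the number of CMOS wires, which is the source of the area savings at the CMOS/nano interface.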
3.3 Current Sneak Paths
One of the greatest challenges associated with the crossbar architecture is reducing or elim-
inating current sneak paths. In order to read or write a single memristor in a crossbar cir-
cuit, current should flow through only that memristor’s corresponding row and column.
For example, in Figure 3.4, to read or write the blue memristor, a voltage is applied to
the memristor’s corresponding row, and the corresponding column is grounded. Ideally,
current only flows along the white path through the blue memristor. However, since the
memristor elements are purely resistive, current may take parallel paths, or sneak paths,
from the voltage source to ground. Figure 3.4 shows one such path in red. In general, mul-
tiple sneak paths will exist when attempting to address any of the elements of a crossbar
circuit. Furthermore, the number of sneak paths is proportional to the size of the crossbar
circuit.
In the context of RRAM, sneak paths have two different effects. During a write oper-
ation, the parallel current paths will cause unintentional writing of several memristors. In
Figure 3.4, the yellow memristors may unintentionally be written when trying to write a
bit to the blue memristor. During a read operation, sneak paths can cause the state of a
memristor to be misinterpreted. For example, if the memristor in blue is off, the current
sensed at the grounded column should be relatively low. However, the additional current
flowing through that column due to the sneak paths may cause the memristor to be read as
on.
Several different devices, architectures, and read/write procedures have been proposed
to mitigate the effect of sneak paths in crossbar circuits. In [46], a memristive device
composed of two antiserial memristors is proposed to improve crossbar read margins. Ref-
erence [9] compares a 1-diode 1-memristor (1D1M) architecture and an unfolded crossbar
architecture for sneak path mitigation. The 1D1M architecture assumes that each mem-
ristor is in series with a rectifying component such as a pn junction diode, eliminating
sneak paths. However, there have been limited demonstrations of such devices. The un-
folded crossbar architecture isolates addressed memristors using M :1 multiplexers on each
column (where M is the number of crossbar rows). In [89], a three-step read process is
proposed for determining the state of a crossbar memristor even with the presence of sneak
paths. Sneak paths can also be eliminated by utilizing 1×N crossbar circuits.
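The read-disturb effect of sneak paths is easiest to see in the smallest crossbar that has one. The sketch below assumes a 2 × 2 crossbar with purely resistive memristors, negligible wire resistance, and floating unselected lines, so that the only sneak path is the series chain of the other three devices. The function name and the resistance values (Ron = 100 Ω, Roff = 16 kΩ, matching r = 160) are illustrative.

```python
def sensed_currents_2x2(r_target, r_same_row, r_diagonal, r_same_col, v_read=1.0):
    """Return (ideal, sneak) currents into the grounded column of a 2x2 crossbar.

    Assumes floating unselected lines: the sneak path runs through the
    same-row device, the diagonal device, and the same-column device in series.
    """
    ideal = v_read / r_target
    sneak = v_read / (r_same_row + r_diagonal + r_same_col)
    return ideal, sneak

R_ON, R_OFF = 100.0, 16000.0

# Worst case for reading an "off" bit: all three neighbors are "on".
ideal, sneak = sensed_currents_2x2(R_OFF, R_ON, R_ON, R_ON)
print(f"ideal = {ideal * 1e6:.1f} uA, sneak = {sneak * 1e6:.1f} uA")
```

With these values the sneak current is far larger than the current through the addressed device, so a simple current threshold would read the off memristor as on.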
3.4 Multi-level RRAM Architectures
The resistance range of thin-film memristors can be divided into several subranges as shown
in Figure 3.5. This allows multiple bits to be stored in a single memristor for increased
Figure 3.4: Illustration of current sneak paths in a memristor crossbar circuit.
Figure 3.5: Thin-film memristor as a multi-level RRAM element
memory density. The ith memory level is represented by the resistance range $R_i - R_{iL}$ to $R_i + R_{iH}$.
Ri is the nominal resistance value for the level, and RiL and RiH are added for noise
margins. Undefined safety margins can be added to detect unintentional changes of the
memristor’s state. Several groups have proposed multi-level RRAM architectures based on
thin-film memristors [9, 90, 91]. In general, though, multi-level RRAM architectures are
much more complex than 2-level RRAM architectures to account for different write times
and process variations. As the number of levels increases, multi-level RRAM becomes
more susceptible to noise caused by read and write cycles, as well as sneak paths.
3.5 Summary
Resistive random access memory based on thin-film memristor crossbars exhibits the high
density, speed, and non-volatility required of a next-generation memory technology. Fur-
thermore, memristor-based RRAM will most likely be the first commercializable CMOS/memristor
hybrid architecture. The strong dependence of RRAM’s write time on temperature offers
the possibility of building temperature sensing capabilities into a CMOS/memristor hybrid
architecture. The next chapter describes how the CMOS/memristor RRAM architecture in
Figure 3.3 can be modified to add temperature sensing capabilities.
Chapter 4
Temperature Sensing Resistive Random
Access Memory Design
The emergence of CMOS/memristor hybrid architectures and the strong temperature
dependence of thin-film memristors were established in the previous chapters. This thesis
leverages that temperature dependence and proposes a thermal profiling scheme for CMOS
devices with vertically-stacked memristor-based RRAM. Figure 4.1 gives a high-level il-
lustration of the proposed design, which consists of three layers. In the bottom layer,
state-of-the-art CMOS processes are used to define processing elements, caches, etc. The
top layer is a dual-purpose temperature sensing/memory (DTS) layer. It contains arrays of
nanowire crossbars with thin-film memristor junctions. This layer serves two purposes:
• To store data in the form of a resistance
• To sense the temperature of different locations on the chip
The intermediate layer serves as an interface between the CMOS and DTS layers. Here,
techniques such as area-distributed interfacing and demultiplexing are used to provide sup-
port for addressing each crosspoint junction in the DTS layer from the CMOS layer. A
temperature sensing resistive random access memory (TSRRAM) will be designed to sup-
port memory storage and temperature sensing in the DTS layer.
Figure 4.1: Proposed dual-purpose temperature sensing/memory architecture for thermal
profiling in CMOS/memristor hybrid architectures.
The rest of this chapter will discuss Figure 4.1 in more detail. Section 4.1 discusses
the design of a bit-addressable TSRRAM block. Section 4.2 describes how TSRRAM
blocks are combined to form byte-addressable TSRRAM groups. In Section 4.3, the top-
level TSRRAM design and controller are discussed. Section 4.4 discusses the TSRRAM
design space and presents an improved TSRRAM design. Sections 4.5 and 4.6 describe
the calibration process and two fundamentally different methods for using the TSRRAM
architecture. Section 4.7 summarizes this chapter.
4.1 Temperature Sensing RRAM Block
A TSRRAM block provides a bit-addressable RRAM memory with added support for tem-
perature sensing. The three temperature dependencies (ion mobility, ion diffusivity, and
carrier transport) of thin-film memristors were discussed in Chapter 2. In general, the write
time of a thin-film memristor will depend on all three of these. Part of the interface circuitry
required for measuring the memristor write time will already be present for the RRAM read
circuit. Therefore, using the write time to extract temperature data will have relatively little
overhead in the TSRRAM block.
Another approach would be to write the memristor to a high resistance state and then
measure the average velocity of the domain wall over some period of time. An
analog-to-digital converter (ADC) would be needed to extract the velocity information. In this
approach, there is also a conflict of interest. For memory storage and, specifically, high
retention times, the ion activation energy of the thin-film memristor needs to be sufficiently
high. However, a high activation energy will yield very small domain wall movements
due to ionic self-diffusion. Therefore, this approach is not best-suited for a dual-purpose
design.
Figure 4.2 shows the TSRRAM block design. In the top-right corner, an N × M crossbar
array is used as the physical storage medium. Each memristor stores a single bit of data
in the form of a high or low resistance. The crossbar array is fabricated as a back-end
CMOS process and is part of the DTS layer. The purple square around the crossbar array
represents the CMOS/nano interface (the middle layer in Figure 4.1). All of the circuitry
outside of the purple box is fabricated using conventional CMOS processing. Multiplexers
are used to select rows or columns depending on the given address. For M = 1, the









Figure 4.2: M ×N TSRRAM block design.
4.1.1 Idle and Read Operation
When the TSRRAM block is idle, the enable signal, en, should be low. This cuts off the
TSRRAM block from the data bus and grounds both terminals of the currently-selected
memristor. In the read operation, en should be high and r̄w should be low. This selects
the read voltage, vr, to be applied to the positive terminal of the memristor at the row
and column specified by the address. The read voltage should be small enough so that it
does not disturb the state of the addressed memristor. The resulting voltage across $R_{pd,col}$ is
compared to a reference voltage. The reference voltage is given by the voltage division
\[
v_{ref} = v_{row} \frac{R_{pd,ref}}{R_{ref,i} + R_{pd,ref}} \tag{4.1}
\]
where $v_{row}$ is the voltage applied to the selected crossbar row, and $R_{ref,i}$ is either $R_{ref,r}$,
$R_{ref,w0}$, or $R_{ref,w1}$, depending on the operation. Assuming $R_{pd,ref} = R_{pd,col}$, then $R_{ref,i} = R_m(x)$,
where x is the desired domain wall location that separates logic low from logic
Figure 4.3: TSRRAM block timer finite state machine.
high. In the case of the read operation, a reasonable boundary is at x = 0.5, so
$R_{ref,r} = 0.5 R_{on} + 0.5 R_{off}$.
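The read comparison can be sketched as two voltage dividers feeding a comparator. The code below is a behavioral illustration only, not the circuit of Figure 4.2. It assumes the linear resistance model Rm(x) = Ron x + Roff(1 − x) and equal pull-down resistors in the reference and column branches. The read voltage, pull-down value, and function names are hypothetical.

```python
def memristance(x, r_on=100.0, r_off=16000.0):
    """Linear-model memristor resistance for state x in [0, 1]."""
    return r_on * x + r_off * (1.0 - x)

def reference_voltage(v_row, r_ref, r_pd_ref):
    """Voltage divider of (4.1)."""
    return v_row * r_pd_ref / (r_ref + r_pd_ref)

def read_bit(x, v_r=0.1, r_pd=1000.0):
    """Comparator output: 1 if the column pull-down voltage exceeds v_ref.

    Uses the read reference R_ref_r = R_m(0.5) and assumes the reference and
    column pull-down resistors are equal.
    """
    v_ref = reference_voltage(v_r, memristance(0.5), r_pd)
    v_col = v_r * r_pd / (memristance(x) + r_pd)
    return 1 if v_col > v_ref else 0

for state in (0.1, 0.4, 0.6, 0.9):
    print(state, read_bit(state))
```

Because both dividers share the same pull-down value, the comparison reduces to whether Rm(x) is below the reference resistance, that is, whether x lies above the chosen boundary.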
4.1.2 Write Operation
In the write operation, en and r̄w should be high. A write voltage will be selected depending
on the data to be written. A positive write voltage is selected if the data signal is high,
which will make the memristor domain wall move into a low resistance state. A negative
write voltage is selected if the data signal is low, which will make the memristor domain
wall move into a high resistance state.
4.1.3 Temperature Sensing
All of the blue components in Figure 4.2 are required for the TSRRAM block to operate as
an RRAM block. The additional red components provide temperature sensing capabilities
Figure 4.4: B-bit TSRRAM group.
to the TSRRAM block. The temperature is estimated by measuring the write time of a
specific memristor. The existing read circuit is used to detect when a memory bit has been
written. The finite state machine (FSM) for the timer is shown in Figure 4.3. After a
reset, the timer is in the STOPPED state. Here, the count value is fixed at its last value.
When the ts signal transitions to a logic high value, the state machine transitions to either
the START0 or START1 state, depending on the current value of the memory bit being
written. For example, if the memory bit is being written from 0 to 1, then the FSM will
transition into the START0 state. Once the memory value is written, the read circuit output
will transition to a logic high, and the timer will be stopped. In the case of a memory bit
being written from 1 to 0, the read circuit’s initial output will be 1. When the negative write
voltage vw0 is applied to the row, the comparator output will switch to 0, and after the bit is
written, the comparator output will switch back to 1. This behavior accounts for the slight
difference in the sequence of states in Figure 4.3 when writing from 0 to 1 or 1 to 0.
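The timer behavior described above can be modeled in software. The following is a behavioral sketch under stated assumptions, not the FSM of Figure 4.3. The STOPPED, START0, and START1 states come from the text, while the WAIT1 state, the choice to clear the count on the rising edge of ts, and the use of the comparator output to select START0 or START1 are assumptions made for illustration.

```python
from enum import Enum

class State(Enum):
    STOPPED = 0
    START0 = 1
    START1 = 2
    WAIT1 = 3   # hypothetical: comparator has dropped to 0 during a 1 -> 0 write

class WriteTimer:
    """Counts clock cycles between the start and completion of a write."""

    def __init__(self):
        self.state = State.STOPPED
        self.count = 0
        self._prev_ts = 0

    def clock(self, ts, cmp_out):
        """Advance one clock cycle given ts and the read comparator output."""
        rising_ts = ts == 1 and self._prev_ts == 0
        self._prev_ts = ts
        if self.state is State.STOPPED:
            if rising_ts:
                self.count = 0   # assumed: count restarts on each measurement
                self.state = State.START1 if cmp_out else State.START0
            return
        self.count += 1
        if self.state is State.START0 and cmp_out == 1:
            self.state = State.STOPPED    # 0 -> 1 write finished
        elif self.state is State.START1 and cmp_out == 0:
            self.state = State.WAIT1      # negative write voltage applied
        elif self.state is State.WAIT1 and cmp_out == 1:
            self.state = State.STOPPED    # 1 -> 0 write finished

timer = WriteTimer()
for ts, cmp_out in [(1, 0), (1, 0), (1, 0), (1, 1)]:
    timer.clock(ts, cmp_out)
print(timer.state.name, timer.count)
```

The extra state on the 1 to 0 path reflects the comparator output dropping before it rises again, which is the asymmetry between the two write directions noted in the text.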
4.2 Temperature Sensing RRAM Group
Conventional memory designs are byte- or word-addressable instead of bit-addressable.
In general, a memory can be B-bit-addressable. Writing B bits to a single TSRRAM
block would require B write cycles. It would also require the memory control circuitry
to have a state machine which would write each bit of the B-bit processor register (e.g.,
memory commit register) sequentially to the TSRRAM block. The same issue arises with
a B-bit memory read operation. The corresponding memory latency would have a severe
performance impact. However, by utilizing groups of B TSRRAM blocks, each B-bit
transaction can take place as B parallel 1-bit transactions.
A B-bit TSRRAM group is illustrated in Figure 4.4. A B-bit data bus, originating at a
processor register, is split between the B M ×N TSRRAM blocks. The r̄w, en, addr, ts,
clk, and rst signals are identically shared by each TSRRAM block. The error and count
outputs from each TSRRAM block are combined to form buses, allowing the temperature
information from each block to be determined from a single memory write. The total width
of the signal path between the processor and a TSRRAM group, excluding the global clock
and reset signals, will be
Wp↔m = B(α + 2) + log2(MN) + 3 (4.2)
The signal path width can be reduced in a variety of ways, each with a performance or
temperature coverage tradeoff. For example, a subset of the TSRRAM blocks in the group
can be designated for temperature sensing, and the rest can be plain RRAM blocks.
4.3 Temperature Sensing RRAM
A TSRRAM combines several TSRRAM groups, as shown in Figure 4.5. Since only one
B-bit word will be written or read at a time, only one TSRRAM group needs to be enabled
at a time. Therefore, all of the signals can be identically shared between the TSRRAM
groups as long as each group has its own enable signal. Now, the new signal path width is
Wp↔m = B(α + 2) + log2(MN) +G+ 3 (4.3)
The TSRRAM architecture is controlled by a TSRRAM control circuit, which acts as an
Figure 4.5: TSRRAM architecture.
Figure 4.6: Top-level architecture.
interface between the TSRRAM and the processor(s). Figure 4.6 shows the top-level ar-
chitecture. A processor core will start a read or write transaction by pulsing the tx signal,
which will cause the TSRRAM controller (Figure 4.7) to enable one of the TSRRAM
groups and begin the data transfer. Data writes will have a latency equal to the worst-case
write time, which will be equal to the write time at room temperature, 300 K.
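As a quick check of the signal path widths in (4.2) and (4.3), the sketch below evaluates both expressions. The function name and the example timer width `alpha = 8` are assumptions chosen for illustration; the thesis does not fix a value for α in this section.

```python
import math


def signal_path_width(b, alpha, m, n, groups=0):
    """Signal path width between processor and memory.

    With groups=0 this is (4.2); passing G > 0 adds the per-group
    enable lines of (4.3).
    """
    address_bits = math.ceil(math.log2(m * n))
    return b * (alpha + 2) + address_bits + groups + 3


# Example with B = 8, M = 1, N = 1024 and a hypothetical alpha of 8.
single_group = signal_path_width(8, 8, 1, 1024)           # 8*10 + 10 + 3
multi_group = signal_path_width(8, 8, 1, 1024, groups=32)  # adds 32 enables
```

The B(α + 2) term dominates, which is why the later design space exploration concentrates on how the timer data are routed.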
The temperature register and its finite state machine are shown in Figure 4.8. After
a write transaction is completed, the TSRRAM controller’s FSM pulses the write done
signal, which causes the temperature register FSM to load all of the timer data, the last
write address, and the current time into its registers in parallel. Then, each timer value
is sequentially translated to an absolute temperature. A valid bit signals when all of the
Figure 4.7: TSRRAM controller finite state machine.
Figure 4.8: (a) Temperature register finite state machine and (b) temperature register.
timer data has been converted. The temperature register serves as a snapshot of the last
temperature measurement.
Figure 4.9: TSRRAM design space.
4.4 TSRRAM Design Space Exploration
The TSRRAM architecture presented in the last section can be improved in several ways. In
general, the TSRRAM architecture’s design space is three-dimensional, as shown in Figure
4.9. Design complexity and area overhead refer to the amount of additional logic and rout-
ing needed to implement the TSRRAM architecture. Temperature error and coverage refer
to the accuracy of the sensed temperatures, and the areal temperature coverage. Write time
latency refers to the number of extra clock cycles required by the TSRRAM to implement
a write transaction. Five points, A, B, C, D, and E are shown in the design space. Point A
corresponds to the design presented in the last section. All of the other points correspond
to architectures that attempt to improve upon A. Furthermore, each point is normalized to
point A.
Architecture B improves upon A by utilizing a B-bit dedicated bus to route timer data,
rather than the nB-bit bus in architecture A. With a dedicated bus, architecture B will still
have no extra write time latency caused by the temperature measurement. However, extra
control logic will be needed to enable the timer data bus to be used for serial transmission
of the timer data from a previous write transaction while a new write transaction is taking
place. Architecture C eliminates a dedicated timer data bus, and uses the memory data bus
to transmit timer data serially, greatly reducing the routing complexity. However, n extra
clock cycles will be added to the write time latency to transmit the timer data. Architecture
D designates a subset of the B TSRRAM blocks in a group to be temperature sensing,
reducing the size of the required timer data bus. However, this also reduces the temperature
coverage, and more write transactions will be required to achieve a high resolution thermal
profile. Finally, architecture E moves the n-bit timers from the TSRRAM blocks into the
temperature registers. In the case where G > 1, this will greatly reduce the number of
timers required, because B timers can be shared by all of the groups. However, with extra
logic and routing between the TSRRAM block’s read circuit and the timer, the accuracy of
the temperature measurements may be significantly diminished.
Architecture C will be implemented in this thesis work. Figure 4.10 shows the modified
TSRRAM block architecture, which allows timer data to be serially transmitted on the data
bus. An extra signal from the timer circuit, ren, switches the data bus driver between the
read circuit output and the MSB of the count value. Figure 4.11 shows the modified timer
FSM. The count value is now reset in the STOPPED state. A SHIFT state is also added.
In this state, the count value is shifted left at each clock cycle, and the data bus is driven
by the MSB of the count value. In order to reduce the write time latency, the TSRRAM
controller may choose to only shift/transmit the most significant ns bits, where ns < n.
However, this will reduce the accuracy of the temperature measurement. Figures 4.12-4.15
show the modifications to TSRRAM group architecture, the TSRRAM architecture, the
TSRRAM controller and finite state machine, and the temperature register and finite state
machine. The modified TSRRAM FSM routes timer data serially from the TSRRAM to the








Figure 4.10: Improved TSRRAM block.
transforms the timer data into temperature data.
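The SHIFT state and the option of sending only the ns most significant bits can be illustrated with a short sketch. The function names are hypothetical, and the receiver-side reconstruction (padding the missing low bits with zeros) is an assumption about how a controller might rebuild the count.

```python
def shift_out_msb_first(count, n, ns):
    """Return the ns most significant bits of an n-bit count, MSB first."""
    bits = []
    for _ in range(ns):
        bits.append((count >> (n - 1)) & 1)     # bus is driven by the MSB
        count = (count << 1) & ((1 << n) - 1)   # shift left, keep n bits
    return bits


def rebuild_count(bits, n):
    """Reassemble received bits; untransmitted low bits are taken as zero."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value << (n - len(bits))


full = rebuild_count(shift_out_msb_first(0b10110110, 8, 8), 8)
truncated = rebuild_count(shift_out_msb_first(0b10110110, 8, 4), 8)
```

Sending only 4 of 8 bits halves the transfer time in this example but discards the low nibble of the count, which is the accuracy loss mentioned in the text.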
4.5 Calibration
Due to the combination of process variations and material defects, each memristor’s write
time at a specific temperature will vary. For example, in the linear ionic drift memristor
model, the write time (3.1) depends on the memristor’s on and off resistances, the film
thickness, and the ion mobility. As operating temperatures increase, the memristor write
time will also become more dependent on the ionic diffusion coefficient. Figure 4.16 shows
the variation in memristor write speed versus several varying process parameters. The
write speed is most sensitive to the film thickness because of the write time’s quadratic
Figure 4.11: Improved TSRRAM block timer finite state machine.
dependence on D.
A calibration process will be required to mitigate the effect of process variations on
TSRRAM temperature measurements. Due to the extra design and test costs associated
with sensor calibration, many commercial processors incorporate uncalibrated sensors [92].
In this work, a calibration process is assumed as follows. In a post-fabrication step, the chip
will be placed in a thermally controlled environment, where it will be subjected to a range
of temperatures. For each temperature Ti, the average of the write times of each memristor
t̄w will be calculated. Then, t̄w will be mapped to Ti in the temperature LUT.
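The calibration step above amounts to building a lookup table from averaged write times. The following is a minimal sketch, assuming write times are recorded as timer counts and that a nearest-entry lookup is acceptable; the function names and the sample data are hypothetical.

```python
from bisect import bisect_left
from statistics import mean


def build_lut(write_times_by_temp):
    """Map the average write time at each calibration temperature to that temperature."""
    lut = [(mean(times), temp) for temp, times in write_times_by_temp.items()]
    return sorted(lut)  # ascending write time


def lookup_temperature(lut, write_time):
    """Return the temperature whose calibrated write time is closest."""
    keys = [tw for tw, _ in lut]
    i = bisect_left(keys, write_time)
    candidates = lut[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda entry: abs(entry[0] - write_time))[1]


# Hypothetical calibration data: temperature (K) -> measured counts.
calibration = {300.0: [70, 72], 350.0: [40, 42], 400.0: [25, 27]}
lut = build_lut(calibration)
```

Averaging across memristors at each temperature is what suppresses device-to-device process variation in the table.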
Controlling the temperature of the chip during calibration could be challenging, espe-
cially if the TSRRAM needs to be periodically re-calibrated. An alternative method is to
run benchmark programs with varying maximum temperatures to heat up the chip. High-
accuracy CMOS temperature sensors can then be used to measure the temperature asso-
ciated with memristor write times during calibration. However, this calibration method
would map memristor write times to CMOS layer temperatures. Therefore, the thermal




Figure 4.12: Improved (a) TSRRAM group and (b) TSRRAM architectures.
Figure 4.13: Improved top-level architecture.
temperature of the crossbar layer.
Figure 4.14: Improved TSRRAM controller finite state machine.
Figure 4.15: Improved (a) temperature register finite state machine and (b) temperature
register.
4.6 Active and Passive Sensing
The TSRRAM architecture can be used to sense temperatures both actively and passively.
In active temperature sensing, a software or hardware-level DTM algorithm would write
data to certain memory locations in order to determine a global temperature profile, or to
examine fine-grained temperature statistics for a specific region of the chip. Active sensing
Figure 4.16: Memristor write speed variation.
schemes should first read the data at a specific memory location, then write to it for a
temperature measurement, and then write the original data back. During this entire process,
context switching should be disabled to ensure that intermediate data modifications do not
affect other tasks.
The intrinsic write transactions that exist in a set of processor tasks can also be used to
gather thermal profile data using the TSRRAM. Assuming that several tasks are running
on a single processor, the spatial distribution of memory write accesses should look ap-
proximately random, yielding temperature measurements that cover the entire chip. This
method is preferred because there is no possibility of data corruption, and temperature mea-
surements can be taken in parallel with task write transactions. However, this approach will
not work when the processor is idle. In that case, active sensing can be used, or thermal
profiling can be halted.
Both active and passive sensing mechanisms will be prone to noise from various sources
such as process variations, measurement quantization, and random noise. However, the
large number and low latency of the memristor temperature sensors enables spatial and
temporal sampling to be used to filter out the effects of these noise sources. For example, a
single address in the TSRRAM can be written to several times in rapid succession in order
to find a time average of the temperature at that location. Using the mean temperature from
that sequence of measurements will filter out random noise. Furthermore, the high density
of temperature sensors in the crossbar layer allows a small window of sensors to be used
to measure the temperature at a single location on the chip. Sensor errors due to random
process variations can be mitigated by taking several measurements in the area surrounding
the desired sensor location and finding the statistical mean or median value.
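The two filtering ideas above, temporal averaging at one address and a spatial median over neighbouring sensors, can be sketched as follows. The function names are hypothetical, and using address distance as a stand-in for physical distance is a simplifying assumption.

```python
from statistics import mean, median


def temporal_average(samples):
    """Mean of repeated measurements at a single address (filters random noise)."""
    return mean(samples)


def spatial_median(readings, center, half_window):
    """Median over sensors whose address lies within half_window of center.

    readings maps address -> temperature in kelvin. Address distance is
    used here as a proxy for physical distance on the chip.
    """
    window = [t for addr, t in readings.items() if abs(addr - center) <= half_window]
    return median(window)


readings = {10: 350.0, 11: 351.0, 12: 370.0, 13: 352.0, 30: 300.0}
filtered = spatial_median(readings, center=12, half_window=2)
```

The median suppresses the outlier at address 12, which is the kind of process-variation error the text describes.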
4.7 Summary
This chapter presented TSRRAM, a CMOS/memristor hybrid memory architecture capable
of temperature sensing. Data are stored in crossbar circuits spatially distributed over CMOS
processor cores. Temperature information can be extracted from any part of the chip by
measuring the write time of the TSRRAM at an address corresponding to that location.
Due to process variations, a calibration process will be required in order to map timing
data to temperature information. Active and passive sensing provide two fundamentally
different ways to utilize the architecture, based on the load of the system. The next chapter




This chapter discusses the simulation setup used to test the TSRRAM architecture. Section
5.1 discusses a custom simulation framework designed as part of this thesis work. Sec-
tion 5.2 shows the specific design parameters that were simulated. Section 5.3 lists all
assumptions that were made for simulations. Section 5.4 summarizes the chapter.
5.1 Crosstherm Simulation Framework
Part of this thesis work developed a simulation framework for analyzing the effects of
temperature variation in CMOS/memristor architectures. Figure 5.1 shows a high-level
block diagram of the Crosstherm simulation framework. The framework is divided into
three main modules. The first is the processor thermal simulation tool chain, which pro-
vides space and time-dependent thermal profiles for a given processor architecture and
benchmark programs. SimpleScalar 2.0 [93], a well-known microarchitectural simulator,
provides switching activities for a given benchmark program and processor type. This
work uses the Alpha 21364 processor core [94] with its L2 cache removed for all simulations.
Wattch [95] provides power estimation from the switching activities generated in
SimpleScalar. Finally, HotSpot [96] estimates thermal profiles based on the power data
from Wattch. HotSpot was modified to report grid-level (default is block-level) thermal
data every 10000 CPU cycles. The second module simulates the memristor partition of the
design, which is implemented in an Analog/Mixed Signal (AMS) language. In this work,
Figure 5.1: High-level block diagram of the Crosstherm simulation framework.
all designs were implemented in Verilog AMS. Cadence AMS Designer is a flexible mixed
signal simulation environment that compiles, elaborates and simulates mixed language de-
signs. A Python script was also written to automatically generate Verilog AMS source
files based on different options and design parameters. This allowed designs to be highly
generic and easily customizable.
5.2 Design Parameters
Each of the three modules in the Crosstherm framework is highly customizable. Table
5.1 shows the key design parameters used for simulation. The ion jump frequency, ion
jump distance, ion charge, film thickness, and on and off resistances were taken from [76].
Table 5.1: Design parameters.
Parameter                       Symbol     Value
Ion Jump Frequency              f          10 THz
Ion Jump Distance               a          0.15 nm
Ion Activation Energy           EA         0.18 eV
Ion Charge                      qI         +2 e
Film Thickness                  D          10 nm
ON Resistance                   Ron        100 Ω
OFF Resistance                  Roff       16 kΩ
Crossbar Rows                   M          1
Crossbar Columns                N          1024
Column Pull-down Resistor       Rpd col    1 kΩ
Reference Pull-down Resistor    Rpd ref    1 kΩ
Read Reference Resistor         Rref r     8050 Ω
Write 0 Reference Resistor      Rref w0    15.2 kΩ
Write 1 Reference Resistor      Rref w1    4279 Ω
CMOS Voltage                    vCMOS      3 V
Read Voltage                    vr         0.5 V
Write 0 Voltage                 vw0        -10 V
Write 1 Voltage                 vw1        10 V
TSRRAM Group (Word) Size        B          8
Number of TSRRAM Groups         G          32
Clock Frequency                 fclk       1 GHz
Simulation Timestep             N/A        100 ps
The activation energy and write voltages were chosen to reflect write times on the order of
those presented in experimental devices. The model chosen for this work assumes that the
ion drift velocity is linear in the electric field (vI = µIE). As Strukov et al. discuss
in [76], vI becomes exponentially dependent on E in very thin films. However, since models
based on this phenomenon are not yet well developed, this work
assumed low activation energies in the realm of superionic crystals. As will be discussed
in Chapter 7, future work will explore the temperature dependence of different memristor
models.
The parameters M , N , B, and G were chosen to achieve a 4 kB memory size with
sneak path elimination. The read reference resistance is chosen to be the resistance at which
Figure 5.2: Choosing Rref w0 and Rref w1.
the memristor domain wall is 0.5. Rref w0 and Rref w1 were chosen such that even with a
5% variation in Ron or Roff (1) the timer circuit would still be able to detect the memristor
switching states, and (2) the measured write time from 0 to 1, tw0→1, would be the same as







































Figure 5.3: CMOS and crossbar layer floorplans.
been written to a memristor. The points that satisfy (5.1) are plotted in Figure 5.2 along
with the constraints for 5% Ron and Roff variation tolerance. Rref w0 and Rref w1 were
chosen from the red point, which satisfied all three constraints. The clock frequency was
chosen to be 1 GHz. A 0.5 V read voltage was used for non-destructive reading. The
simulation timestep was chosen to be 10% of the clock period.
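The read reference resistor value listed in Table 5.1 can be checked against the linear ionic drift model, in which the memristance is a weighted sum of the ON and OFF resistances. The sketch below assumes a normalized domain wall position x between 0 and 1; the function name is hypothetical.

```python
def memristance(x, r_on, r_off):
    """Linear drift model: x = 1 is fully ON, x = 0 is fully OFF."""
    return r_on * x + r_off * (1.0 - x)


# Midpoint of the domain wall with Ron = 100 ohms and Roff = 16 kilohms.
r_ref_read = memristance(0.5, 100.0, 16_000.0)
```

With the Table 5.1 values the midpoint evaluates to 8050 Ω, which agrees with the listed read reference resistor.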
The physical locations of the crossbar circuits with respect to the Alpha floorplan are
shown in Figure 5.3. Horizontal green lines indicate crossbar rows, which are distributed
evenly across the Alpha 21364 core. Crossbar columns are also distributed evenly across
each row, but are not shown for clarity. The address 0x000 corresponds to the top left of
the Alpha floorplan, and 0xFFF corresponds to the bottom right. The blue dots indicate an
example word striped across crossbars within the second group.
5.3 Assumptions
A few assumptions were made while simulating the TSRRAM architecture. First, all com-
ponents were assumed to be ideal. No parasitic RC delays were simulated. Furthermore,
only memristors were assumed to be affected by temperature. The temperature depen-
dence in all other components was assumed to be negligible. Heat generation within the
crossbar layer and the thermal insulation between CMOS and memristor layers were ig-
nored. Finally, the TSRRAM was assumed to be an L2 cache with globally random/locally
sequential access patterns.
5.4 Summary
This chapter discussed the simulation environment and chosen design parameters for testing
the TSRRAM architecture. Crosstherm, a custom simulation environment, was designed
to perform all simulations. Simulation parameters were chosen for a 4 kB cache. Memris-
tors in the crossbar layer were distributed evenly across the top of the CMOS layer. The




The TSRRAM design was simulated in the Crosstherm framework using the design parameters
and assumptions discussed in the last chapter. Benchmarks were taken from the
SPEC2000 suite [97]. Power trace data from SPEC2000 benchmarks were taken from pre-
vious work [98]. Table 6.1 shows the simulated benchmarks. Two hot, one medium, and






one cold benchmark were simulated. Each benchmark was fast-forwarded to the last 500
million cycles [98]. Power trace data was run through HotSpot twice to simulate initial con-
ditions. The final 50 thermal cycles of each benchmark were used as input to the TSRRAM
Verilog AMS model.
6.1 Calibration Run
A calibration run was performed to generate the values in the temperature lookup table.
Address 0x000 was written to at controlled temperatures between 300 K and 400 K in 1
Figure 6.1: Calibration results.
K increments, and the corresponding write times were recorded. Figure 6.1 shows the re-
sults. The blue curve is the theoretical write time at each temperature, using the design
parameters from Table 5.1. The green circles correspond to the simulated write times at
each temperature. There is a small discrepancy between the theoretical and simulated write
times. The difference may be due to tw0→1 and tw1→0 being slightly different. The dif-
ference may also be due to small resistances and capacitances added to solve convergence
issues in the simulator, or simulator error. The simulated data were fit to a 10th-degree
polynomial (shown in pink), and the corresponding lookup table data were generated from
the fit. The theoretical maximum error due to time quantization was also calculated and
plotted (inset of Figure 6.1) using
Figure 6.2: Number of passive temperature measurements for each thermal cycle.
E_{max}(T) = T\left(\lceil t_w(T) - 1/f_{clk} \rceil\right) - T\left(\lceil t_w(T) \rceil\right) \qquad (6.1)
where T (tw) is the temperature corresponding to a write time tw, and tw(T ) is the write
time corresponding to temperature T . According to the results, the maximum error should
be < 4 K over the temperature ranges in the chosen benchmarks.
6.2 Passive Sensing Results
6.2.1 Sensor Accuracy
The passive sensing capabilities of the TSRRAM architecture were tested by implement-
ing a random mix of read and write instructions over the 50 simulated thermal cycles. Data
for each instruction were chosen at random, and addresses were globally random, locally
sequential, with a sequential period of 10. Figure 6.2 shows the number of temperature
measurements that were collected for each thermal cycle with the randomly generated set
of instructions. In general, the chosen set of instructions represents a memory-bound pro-
cessor running several different tasks. This is the best-case scenario for passive sensing. In
the case of CPU-bound loads, where few tasks are running, active sensing techniques will
need to be employed.
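One possible reading of "globally random, locally sequential, with a sequential period of 10" is a random base address followed by a run of ten consecutive addresses. The generator below is a sketch under that interpretation; the function name, the 0x1000 address range (taken from the 0x000 to 0xFFF span in Chapter 5), and the wrap-around at the end of memory are assumptions.

```python
import random


def passive_addresses(num_accesses, max_addr=0x1000, run_length=10, seed=None):
    """Generate addresses in runs: a random base, then run_length sequential accesses."""
    rng = random.Random(seed)
    addresses = []
    while len(addresses) < num_accesses:
        base = rng.randrange(max_addr)
        for offset in range(run_length):
            if len(addresses) == num_accesses:
                break
            addresses.append((base + offset) % max_addr)  # wrap at end of memory
    return addresses


trace = passive_addresses(25, seed=1)
```

Over many runs the bases spread across the whole address range, which is what gives passive sensing its chip-wide coverage.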
Figure 6.3 shows an example of the temperature measurements collected over one ther-
mal cycle of the GCC benchmark using passive sensing. The left plot shows the thermal
profile produced by HotSpot. The maximum temperatures (hotspots) of each functional
unit were calculated and plotted as well (black circles). The right plot shows the locations
and values of temperature measurements produced by passive sensing. For a memory-
bound load running several tasks, the thermal profile is almost completely characterized
via passive sensing.
The relative and absolute errors of TSRRAM temperature measurements for the GCC
benchmark are shown in Figure 6.4. The range of positive errors, where the measured
temperature was less than the actual temperature, is much larger than the range of negative
errors. This can be explained by Figure 6.5. The stop signal for the timer goes high or
low when the memristor domain wall reaches xf0 or xf1. However, the change is not
registered until the next rising edge of the clock. Therefore, any measured write time will
be greater than or equal to the actual write time of the memristor (ideally), meaning that any
Figure 6.3: Passive sensing measurements for GCC benchmark over 1 thermal cycle.
temperature measurement will be less than or equal to the actual temperature. As a result,
all temperature errors due to time quantization should be positive. The negative errors are
likely a result of the calibration fit or simulation error. The sensor errors and maximum
errors versus temperature are plotted in Figure 6.6. Figure 6.6(a) shows the error versus
temperature for every sensor measurement. For a given actual temperature, the error is
quantized, resulting in the local lines of slope=1. The maximum absolute error in each 1 K
temperature interval is plotted in Figure 6.6(b). A general increase in maximum error can
be seen with increasing temperature, which is expected.
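The sign of the time quantization error can be reproduced with a toy model. The sketch below is not the write time expression (3.1); it assumes a simple Arrhenius-style curve with a hypothetical prefactor `a`, chosen only so that write times fall in the tens of nanoseconds, and rounds each write time up to the next clock edge.

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K


def write_time(temp_k, a=6.6e-11, e_a=0.18):
    """Toy Arrhenius-style write time in seconds (hypothetical stand-in for (3.1))."""
    return a * math.exp(e_a / (K_B * temp_k))


def temperature_from_time(t_w, a=6.6e-11, e_a=0.18):
    """Inverse of the toy model above."""
    return e_a / (K_B * math.log(t_w / a))


def quantization_error(temp_k, f_clk=1e9):
    """Actual minus measured temperature when the stop registers on the next clock edge."""
    cycles = math.ceil(write_time(temp_k) * f_clk)
    measured_time = cycles / f_clk  # never shorter than the true write time
    return temp_k - temperature_from_time(measured_time)


errors = [quantization_error(t) for t in range(300, 401)]
```

Because the measured time is never shorter than the true write time and write time falls with temperature, every error in this model is non-negative, matching the argument above. The magnitudes depend entirely on the assumed curve and should not be compared with Figure 6.6.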
Error statistics for all of the simulated benchmarks are shown in Table 6.2. Results
Figure 6.5: Time quantization error.












































Figure 6.6: GCC (a) error vs. temperature (b) maximum error vs. temperature.
are consistent across all benchmarks, with the average error ≈ 2 K. The maximum errors
are likely a combination of (1) simulation error, (2) fitting error from calibration, and (3)
time quantization error. However, these large errors may be correctable in DTM
algorithms.
Table 6.2: Passive sensing errors.
Benchmark    Max. Error (K)    Mean Error (K)    Std. Dev. (K)
GCC          16.49             2.20              2.65
Art          15.06             1.98              1.87
Gzip         15.11             2.23              2.16
Parser       17.88             2.16              2.55
6.2.2 Analysis of DTM Memory Size
A DTM software task will make use of data in the temperature register by storing snapshots
in memory and using the data to derive a thermal profile. A snapshot consists of the contents
of the temperature register (address, timestamp, and temperatures) at a given time. New
snapshots will replace older snapshots in a FIFO structure. In general, the size (in bytes) of
the DTM memory will be
SDTM = NS(SI +B + SD) (6.2)
where NS is the number of snapshots stored in the FIFO structure, SI is the size (in bytes)
of an integer, and SD is the size of a double precision floating point number. An integer is
used to store the address. A single byte is large enough to hold all of the temperature mea-
surements, provided they are offset from a constant value (e.g., 275 K). A double precision
floating point number is used to store the timestamp of the temperature register snapshot.
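The DTM memory size in (6.2) and the single-byte temperature encoding can be illustrated as follows. The 4-byte integer and 8-byte double are assumptions about the target platform, and the function names are hypothetical.

```python
def dtm_memory_bytes(num_snapshots, word_bits=8, int_bytes=4, double_bytes=8):
    """S_DTM = N_S (S_I + B + S_D), with one byte per temperature and B temperatures."""
    return num_snapshots * (int_bytes + word_bits + double_bytes)


def encode_temperature(temp_k, offset_k=275):
    """Store a temperature as a single byte offset from a constant value."""
    code = round(temp_k - offset_k)
    if not 0 <= code <= 255:
        raise ValueError("temperature outside the one-byte range")
    return code


size_100 = dtm_memory_bytes(100)  # 100 snapshots of 20 bytes each
```

With a 275 K offset, one byte covers 275 K to 530 K at 1 K resolution, which spans the 300 K to 400 K calibration range.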
A larger DTM memory will provide the DTM response algorithms with more data for
determining the current thermal profile at any instant. Thermal profiles can generally be
















Figure 6.7: Illustration of coverage metric.
Using passive sensing, a larger DTM memory will have a higher probability of yielding
temperature data from regions of the chip that are close to the hotspots. A sensor coverage
metric can be defined based on the average distance between every hotspot
location and its nearest sensor. Figure 6.7 illustrates the idea.
Three hotspots are shown (red circles) with five sensor locations (blue squares). The dis-
tances between each hotspot and the nearest sensor are given by D1-D3. The maximum
distance between a hotspot and a sensor is also shown (Dmax). When every hotspot loca-
tion is shared with a sensor location, the coverage will be 100%. If no sensors exist, or the
sensors are located at the maximum distance from every hotspot, then the coverage will be
Figure 6.8: Coverage and mean hotspot errors for varying DTM memory sizes.
0%. Therefore, the coverage metric is defined as
\text{coverage} = 100\% - \frac{D_{avg}}{D_{max}} \times 100\% \qquad (6.3)
where Davg is the average distance between a hotspot and a sensor.
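The coverage metric can be computed directly from hotspot and sensor coordinates. In the sketch below, the function name and the use of 2-D Euclidean distance are assumptions, and Dmax is passed in by the caller because the text does not specify how it is obtained for a given floorplan.

```python
import math


def coverage_percent(hotspots, sensors, d_max):
    """Coverage from (6.3): 100% minus the normalized average nearest-sensor distance."""
    if not sensors:
        return 0.0  # no sensors gives zero coverage, as stated in the text
    nearest = [min(math.dist(h, s) for s in sensors) for h in hotspots]
    d_avg = sum(nearest) / len(nearest)
    return 100.0 - (d_avg / d_max) * 100.0


# One hotspot at the origin, nearest sensor 5 units away, Dmax = 10 units.
example = coverage_percent([(0.0, 0.0)], [(3.0, 4.0)], d_max=10.0)
```

When every hotspot coincides with a sensor the average distance is zero and the function returns 100%, matching the limiting cases described above.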
The coverage was determined for different DTM memory sizes over all benchmarks.
Fifteen hotspots (one per functional unit) were calculated for every thermal cycle. Figure
6.8(a) shows the mean coverage for each benchmark with DTM sizes varying from 1 to 100.
The coverage asymptotically approaches 100% as the DTM memory size is increased. The
mean hotspot error, which is the average temperature difference between every hotspot and
the closest sensor measurement, is plotted in Figure 6.8(b). It is clear that very small DTM
memory sizes will yield larger mean hotspot errors, making thermal profile reconstruction
more difficult.
Larger DTM memory sizes yield higher average coverage and lower mean hotspot er-
rors. However, each snapshot in a DTM memory will need to be checked periodically for
old timestamps. If a timestamp is too old, the DTM won’t be able to use the snapshot. For
very large DTM memories, this checking could create a large performance overhead.
Algorithm 1 Two-iteration active sensing algorithm for finding global hotspot.
MAX_ADDR = M × N × G
NUM_REGIONS = 8
LOW_RES_ADDR_INCR = MAX_ADDR / NUM_REGIONS
HIGH_RES_ADDR_INCR = LOW_RES_ADDR_INCR / 32
START_ADDR = LOW_RES_ADDR_INCR / 2
// Get the initial low resolution profile
addr = START_ADDR
while addr < MAX_ADDR do
    data_read = READ(addr)
    data_write = NOT data_read
    write(addr, data_write)
    store(DTM_MEMORY, temperature_register_snapshot)
    write(addr, data_read)
    addr = addr + LOW_RES_ADDR_INCR
end while
// Find the region with the maximum temperature
START_ADDR = get_max_temp_addr(DTM_MEMORY) − LOW_RES_ADDR_INCR / 2
// Scan hottest region with increased resolution
addr = START_ADDR
while addr < START_ADDR + LOW_RES_ADDR_INCR do
    data_read = READ(addr)
    data_write = NOT data_read
    write(addr, data_write)
    store(DTM_MEMORY, temperature_register_snapshot)
    write(addr, data_read)
    addr = addr + HIGH_RES_ADDR_INCR
end while
// Find the region with the maximum temperature
HOTSPOT_ADDR = get_max_temp_addr(DTM_MEMORY)
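Algorithm 1 can be exercised in software against a stand-in memory. The sketch below is illustrative only: `FakeTsrram` and its `temp_at` callback are hypothetical substitutes for the TSRRAM and the temperature register, and the temperature profile used in the example is invented.

```python
class FakeTsrram:
    """Stand-in memory: each write also records a temperature for that address."""

    def __init__(self, size, temp_at):
        self.data = [0] * size
        self.temp_at = temp_at  # hypothetical address -> kelvin function
        self.snapshot = None

    def read(self, addr):
        return self.data[addr]

    def write(self, addr, value):
        self.data[addr] = value & 0xFF
        self.snapshot = (addr, self.temp_at(addr))


def sense(mem, addr, dtm_memory):
    """Read, write the complement, store the snapshot, then restore the data."""
    original = mem.read(addr)
    mem.write(addr, ~original & 0xFF)  # complement flips every bit
    dtm_memory.append(mem.snapshot)
    mem.write(addr, original)


def hottest_addr(dtm_memory):
    return max(dtm_memory, key=lambda snap: snap[1])[0]


def find_global_hotspot(mem, max_addr, num_regions=8, refine=32):
    low_incr = max_addr // num_regions
    high_incr = max(low_incr // refine, 1)
    dtm_memory = []
    # Initial low-resolution profile, one sample per region.
    for addr in range(low_incr // 2, max_addr, low_incr):
        sense(mem, addr, dtm_memory)
    # Rescan the hottest region at higher resolution.
    start = hottest_addr(dtm_memory) - low_incr // 2
    for addr in range(start, start + low_incr, high_incr):
        sense(mem, addr, dtm_memory)
    return hottest_addr(dtm_memory)


# Invented profile with a single peak at address 3001.
memory = FakeTsrram(4096, lambda a: 400.0 - abs(a - 3001) / 100.0)
hotspot = find_global_hotspot(memory, 4096)
```

Restoring the original data after each measurement keeps the scan transparent to other tasks, provided context switching is disabled as discussed in Section 4.6.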
6.3 Active Sensing Results
When the processor is idle, or when specific temperature data points are needed, active
sensing may need to be employed. The implementation of active sensing in the TSRRAM
Figure 6.9: Finding global hotspot in GCC benchmark–low sensor resolution.
was demonstrated by iteratively scanning the chip for a global hotspot over one thermal
cycle of the GCC benchmark. An outline of the algorithm is given in Algorithm 1. The
first iteration of the active sensing routine took temperature measurements in eight different
sections of the chip, from left to right and top to bottom. The choice of eight is somewhat
arbitrary. In general, the spatial sampling rate will depend on the characteristics of the
thermal gradient. The initial low-resolution measurements are shown in Figure 6.9.
An active sensing algorithm would look at the initial low-resolution thermal data and
determine that the upper-right octant of the chip may have a hotspot. The algorithm could
then adjust the resolution of its scan (from left to right) over that region of the chip to
pinpoint the hotspot. The higher resolution scan is shown in Figure 6.10. The effect of
Figure 6.10: Finding global hotspot in GCC benchmark–higher sensor resolution.






















































Figure 6.11: Effect of active sensing resolution on (a) hotspot coverage and (b) absolute
hotspot error.
the active sensing resolution (spatial sampling rate) is shown in Figure 6.11. In the ini-
tial low-resolution sweep, the coverage reaches a maximum value of 87.8% (Figure 6.10).
However, when the top right octant of the chip is swept at a finer resolution, the cover-
age increases to 100%. Similarly, the absolute error improved from 2.5 K to 0.49 K when
changing from low-resolution to high-resolution sensing. The flexibility of being able to
dynamically change the spatial sampling rate of temperature sensors is a novel feature in the
TSRRAM architecture, and may be able to yield dramatic improvements in DTM response
mechanisms.
6.4 Performance Overheads
Different performance overheads are incurred by using either active or passive sensing
techniques. In active sensing, data are written to specific locations for the purposes of
extracting temperature. As a result, an active sensing instruction requires four steps. First,
the data at the desired sensing location have to be read. Then, based on those data, new data
are generated for the write instruction. The simplest way to generate the new data is
to take the bitwise complement of the original data. In general, the number of temperature
data points that can be obtained from any write instruction is equal to the Hamming distance
between the original data and the written data, so the complement yields the write
instruction that produces the most temperature information. The final steps are writing the
modified data and writing the original data back to the sensed memory location. Therefore,
the number of cycles required to perform each active sensing write instruction is
Cactive = RL+ CP + 2WL (6.4)
where RL is the memory read latency, CP is the number of cycles spent processing (i.e.,
getting the bitwise complement), and WL is the memory write latency. For the simulation
parameters used in this work, Cactive is 142 cycles. The performance overhead of active
sensing depends on when it is used. If the processor is idle, then active sensing incurs 0%
performance overhead. However, if the system is fully-loaded then active sensing will take
CPU time away from other tasks.
Passive sensing will, in general, have a much lower performance overhead than active
sensing. The number of cycles required to perform each passive sensing instruction is
equal to the write latency, WL. For the parameters used in this work, WL is 78 cycles.
The performance overhead of passive sensing is B/WL, since B extra cycles are needed
to transfer timer data from the TSRRAM blocks to the temperature register. Therefore, the
performance overhead of passive sensing in this work was (8/78)× 100% = 10.3%. This
overhead could be reduced by using a dedicated bus to transfer all timer data in parallel.
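The overhead expressions in this section are simple enough to check numerically. The sketch below uses hypothetical function names; the passive figure uses the values quoted above (B = 8, WL = 78), while the arguments to `active_cycles` are left to the caller because the individual values of RL and CP are not listed here.

```python
def active_cycles(read_latency, processing_cycles, write_latency):
    """C_active = RL + CP + 2 WL, from (6.4)."""
    return read_latency + processing_cycles + 2 * write_latency


def passive_overhead_percent(word_bits, write_latency):
    """B extra transfer cycles relative to a write of WL cycles."""
    return 100.0 * word_bits / write_latency


def temperature_points(original, written):
    """Hamming distance: the number of bits, and hence memristors, that switch."""
    return bin(original ^ written).count("1")


overhead = passive_overhead_percent(8, 78)
points = temperature_points(0b10110110, 0b01001001)  # bitwise complement
```

Writing the bitwise complement switches all eight bits of the word, so a single active sensing write yields eight temperature data points.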
6.5 Summary
This chapter presented simulation results of the TSRRAM, simulated in the Crosstherm
framework. Passive sensing was simulated by implementing random read and write in-
structions in the TSRRAM, with thermal profile inputs coming from four different bench-
marks. A coverage metric was proposed for determining the effect of DTM memory size
on the ability of DTM algorithms to extract a thermal profile. Active sensing was also sim-
ulated, showing that the resolution of thermal data can be dynamically adjusted to increase
coverage and reduce hotspot errors. Finally, the performance overheads of both passive and
active sensing schemes were discussed.
Chapter 7
Conclusions and Future Work
7.1 Conclusions
This thesis work developed a method for thermal profiling in CMOS/memristor hybrid
architectures. A hybrid CMOS/memristor temperature sensing RRAM (TSRRAM) was
designed which uses the write time of thin-film memristors to measure the temperature
in specific regions of the chip. Two sensing methods, active and passive sensing, were
introduced as alternative ways to extract temperatures from the TSRRAM design. Active
sensing allows dynamic thermal management units to measure temperatures at high resolu-
tions at desired locations across the chip. Passive sensing utilizes the write instructions that
exist in the currently running task to measure temperatures at random locations. A simu-
lation framework was also developed for analyzing the effects of temperature variation in
CMOS/memristor hybrid architectures. The framework combines processor thermal char-
acterization tools (SimpleScalar, Wattch, and HotSpot) with the Cadence AMS simulation
environment.
The TSRRAM design was implemented in Verilog AMS and tested using the devel-
oped simulation framework. Passive sensing was simulated for four benchmarks from the
SPEC2000 CPU suite. The average sensor error across all benchmarks is 2.14 K, and the
maximum error is 16.49 K. The mean error is well within the useful range of DTM re-
sponses, such as thermal-aware task scheduling. The effect of varying the memory size
of DTM units while using passive sensing was also analyzed. Larger memory sizes yield
temperature measurement data that better represent all of the hotspots on the chip. The
performance overhead of passive sensing for the chosen design parameters was shown to
be 10.3%. Dynamically adjusting the spatial sampling rate during active sensing was also
demonstrated. Full (100%) coverage can be achieved for a single global hotspot by increasing
the sensing resolution during active sensing. As a result, the absolute error between the
hotspot temperature and closest sensor was shown to decrease from 2.5 K to 0.49 K.
7.2 Future Work
Several avenues exist for extending this thesis work. Here, we have assumed that the thin-
film memristors in the TSRRAM follow a linear ionic drift model, where the ion drift
velocity is linear in the applied electric field. Future work could explore the temperature
sensing capabilities of the TSRRAM using more accurate models (such as those based on
exponential ionic drift) or models that represent memristors composed of different materi-
als, switching mechanisms, etc. In this work, only the effect of temperature on thin-film
memristors was considered. Furthermore, only one of the temperature dependencies in
thin-film memristors, ion mobility, was considered. An extension of this work could con-
sider the effects of temperature in crossbar nanowires as well as CMOS components. The
temperature dependence of ion diffusion and electron transport, as well as the effect of heat
generation in the crossbar layer can also be considered in future models.
In this work, sneak paths in the crossbar circuits were eliminated by considering 1×N
and M × 1 crossbar circuits. Future work can explore the use of general M ×N crossbar
circuits with different sneak path mitigation techniques and the effect on the accuracy of
the TSRRAM. Sensor accuracy could also be improved using high-speed clocks for TSR-
RAM block timer circuits. Finally, this work focused on DTM triggering mechanisms.
Specifically, a method for thermal profiling in CMOS/memristor RRAM using temperature
measurement was presented. Another extension of this work could explore higher level
DTM triggering and DTM response mechanisms. Specifically, some avenues are (1) using
interpolation or smoothing algorithms to correct sensor errors, (2) algorithms for switching
between active and passive sensing based on the CPU workload, and (3) storing statistical




A.1 Linear Ionic Drift Model
Let x = w/D ∈ [0, 1]. The on state of the memristor is that in which x = 1 (w = D), and the off
state is that in which x = 0. Since electron and hole mobility will vary between the doped
and the undoped active regions, the resistivity of those regions will also be different. Let
the resistance of the memristor in the on state be Ron and the resistance of the memristor
in the off state be Roff . The generalized resistance is then [71]
$$R(x) = R_{on}\,x + R_{off}\,(1 - x). \tag{A.1}$$
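As a quick check of (A.1), the following Python sketch evaluates the generalized resistance at the two limiting states and at the midpoint. The resistance values are hypothetical and chosen only for illustration.

```python
def memristance(x: float, r_on: float, r_off: float) -> float:
    """Generalized resistance R(x) = Ron*x + Roff*(1 - x) for x in [0, 1]."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("normalized state x must lie in [0, 1]")
    return r_on * x + r_off * (1.0 - x)


R_ON = 100.0       # hypothetical on-state resistance in ohms
R_OFF = 16_000.0   # hypothetical off-state resistance in ohms

print(memristance(1.0, R_ON, R_OFF))  # fully doped: R = Ron
print(memristance(0.0, R_ON, R_OFF))  # fully undoped: R = Roff
print(memristance(0.5, R_ON, R_OFF))  # midpoint: average of Ron and Roff
```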
When a voltage is applied via source V = v(t), dopants drift in the resulting lateral
electric field E, which is dropped across the doped region. The simplest ionic drift
model assumes that dopant ions have a constant mobility, or that the dopant drift velocity
is linearly proportional to the lateral electric field. Therefore, vI = µIE, where vI is the
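ion drift velocity and µI is the average ion mobility. Since vI = D dx/dt, the change in x
with respect to time is
$$\frac{dx}{dt} = \frac{\mu_I E}{D}. \tag{A.2}$$
Because E is dropped across the doped region, which has width xD, E = Von/(xD), where Von is
the voltage across that region.
For linear ion drift, Von = Ron x i(t), where i(t) is the current through the memristor at time
t. Therefore,
$$\frac{dx}{dt} = \frac{\mu_I R_{on}}{D^2}\, i(t). \tag{A.3}$$
From (A.1), the voltage across the memristor is
$$v(t) = \left(R_{on}\,x + R_{off}\,(1 - x)\right) i(t). \tag{A.4}$$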





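Integrating the drift equation with respect to time, with q(t) the charge that has passed
through the memristor, gives
$$x(t) = \frac{\mu_I R_{on}}{D^2}\, q(t) + A, \tag{A.5}$$
where A is the integration constant. Assuming q(0) = 0 [72], A = x0, the value of x at
t = 0. Writing i(t) in terms of dx/dt and substituting into (A.4) gives
$$v(t) = \frac{D^2}{\mu_I} \left(x + r\,(1 - x)\right) \frac{dx}{dt}, \tag{A.6}$$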
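where r = Roff/Ron. Integrating both sides with respect to time gives
$$\varphi(t) = \int v(t)\,dt = \frac{D^2}{\mu_I} \left(r\,x - \frac{r - 1}{2}\,x^2\right) + B, \tag{A.7}$$
where B is the integration constant. Assuming the boundary condition φ(0) = 0, B =
$-\frac{D^2}{\mu_I}\left(r\,x_0 - \frac{r - 1}{2}\,x_0^2\right)$. Substituting (A.5) for x then gives
$$\varphi(t) = -\frac{\mu_I R_{on}^2}{2 D^2}\,(r - 1)\,q^2(t) + R_{on}\left(r - x_0\,(r - 1)\right) q(t). \tag{A.8}$$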
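The memristance is the derivative of the flux with respect to charge:
$$M(q) = \frac{d\varphi}{dq} = -\frac{\mu_I R_{on}^2}{D^2}\,(r - 1)\,q(t) + R_{on}\left(r - x_0\,(r - 1)\right). \tag{A.9}$$
Solving (A.8) for q(t) yields
$$q(t) = \frac{D^2 \left(r - x_0\,(r - 1)\right)}{\mu_I R_{on}\,(r - 1)} \left(1 - \sqrt{1 - \frac{2 \mu_I\,(r - 1)}{D^2 \left(r - x_0\,(r - 1)\right)^2}\,\varphi(t)}\right). \tag{A.10}$$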
















The 1/D² factor in (A.11) indicates that, for large film thicknesses, memristance is
negligible [59].
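The flux-charge relation (A.8), the memristance (A.9), and the inversion of (A.8) for q can be cross-checked numerically. The Python sketch below is a minimal illustration and not the Verilog-AMS model used in this work. It assumes the root of the quadratic that satisfies q = 0 at φ = 0, and all device parameters (mobility, film thickness, resistances, initial state) are hypothetical values chosen for illustration.

```python
import math


def _shape(r_on, r_off, x0):
    """Return r = Roff/Ron and the recurring factor r - x0*(r - 1)."""
    r = r_off / r_on
    return r, r - x0 * (r - 1.0)


def flux_from_charge(q, mu, r_on, r_off, d, x0):
    """Flux as a function of charge, following (A.8)."""
    r, c = _shape(r_on, r_off, x0)
    return -(mu * r_on**2 / (2.0 * d**2)) * (r - 1.0) * q**2 + r_on * c * q


def memristance_from_charge(q, mu, r_on, r_off, d, x0):
    """Memristance d(phi)/dq, following (A.9)."""
    r, c = _shape(r_on, r_off, x0)
    return -(mu * r_on**2 / d**2) * (r - 1.0) * q + r_on * c


def charge_from_flux(phi, mu, r_on, r_off, d, x0):
    """Root of the quadratic (A.8) in q that gives q = 0 at phi = 0."""
    r, c = _shape(r_on, r_off, x0)
    disc = 1.0 - 2.0 * mu * (r - 1.0) * phi / (d**2 * c**2)
    return d**2 * c / (mu * r_on * (r - 1.0)) * (1.0 - math.sqrt(disc))


# Hypothetical device parameters (SI units), for illustration only.
PARAMS = dict(mu=1e-14, r_on=100.0, r_off=16_000.0, d=10e-9, x0=0.1)

# Charge in coulombs. With linear drift, x = x0 + mu*Ron*q/D^2, so this
# choice moves the state from x0 = 0.1 to x = 0.5, inside [0, 1].
q = 4e-5
phi = flux_from_charge(q, **PARAMS)
q_back = charge_from_flux(phi, **PARAMS)
m = memristance_from_charge(q, **PARAMS)
print(phi, q_back, m)
```

For these assumed parameters the recovered charge matches the input charge, and the memristance at x = 0.5 agrees with (A.1) evaluated for the same Ron and Roff.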
The voltage transfer characteristics (VTC) for the memristor are found by first solving









































$$i(t) = W(\varphi)\,v(t) \tag{A.15}$$
Bibliography
[1] D. Strukov, D. Stewart, J. Borghetti, X. Li, M. Pickett, G. Ribeiro, W. Robi-
nett, G. Snider, J. Strachan, W. Wu, Q. Xia, J. Yang, and R. Williams, “Hybrid
CMOS/memristor circuits,” in Proc. IEEE Int. Symp. Circ. Syst., ISCAS 2010, Jun.
2010, pp. 1967–1970.
[2] D. B. Strukov and K. K. Likharev, “CMOL FPGA: a reconfigurable architecture for
hybrid digital circuits with two-terminal nanodevices,” Nanotechnology, vol. 16, pp.
888–900, 2005.
[3] G. S. Snider and S. R. Williams, “Nano/CMOS architectures using a field-
programmable nanowire interconnect,” Nanotechnology, vol. 18, no. 3, pp. 1–10,
2007.
[4] S. H. Jo and W. Lu, “CMOS compatible nanoscale nonvolatile resistance switching
memory,” Nano Lett., vol. 8, no. 2, pp. 392–397, 2008, pMID: 18217785. [Online].
Available: http://pubs.acs.org/doi/abs/10.1021/nl073225h
[5] D. B. Strukov and R. S. Williams, “Four-dimensional address topology for circuits
with stacked multilayer crossbar arrays,” in Proc. of the National Academy of Sci-
ences, vol. 106, no. 48, 2009, pp. 20 155–20 158.
[6] S. H. Jo, K.-H. Kim, and W. Lu, “Programmable resistance switching in nanoscale
two-terminal devices,” Nano Lett., vol. 9, no. 1, 2009.
[7] Y. C. Yang, F. Pan, Q. Liu, M. Liu, and F. Zeng, “Fully room-temperature-
fabricated nonvolatile resistive memory for ultrafast and high-density memory
application,” Nano Lett., vol. 9, no. 4, pp. 1636–1643, 2009. [Online]. Available:
http://pubs.acs.org/doi/abs/10.1021/nl900006g
[8] Y. Ho, G. Huang, and P. Li, “Nonvolatile memristor memory: Device characteristics
and design implications,” in Computer-Aided Design - Digest of Technical Papers,
2009. ICCAD 2009. IEEE/ACM International Conference on, Nov. 2009, pp. 485 –
490.
[9] H. Manem, G. S. Rose, X. He, and W. Wang, “Design considerations
for variation tolerant multilevel CMOS/nano memristor memory,” in Proc.
of the 20th symposium on Great lakes symposium on VLSI, ser. GLSVLSI
’10. New York, NY, USA: ACM, 2010, pp. 287–292. [Online]. Available:
http://doi.acm.org/10.1145/1785481.1785548
[10] D. Niu, Y. Chen, and Y. Xie, “Low-power dual-element memristor based memory
design,” in Proc. of the 16th ACM/IEEE international symposium on low power
electronics and design, ser. ISLPED ’10. New York, NY, USA: ACM, 2010, pp.
25–30. [Online]. Available: http://doi.acm.org/10.1145/1840845.1840851
[11] W. Robinett, M. Pickett, J. Borghetti, Q. Xia, G. S. Snider, G. Medeiros-Ribeiro,
and R. S. Williams, “A memristor-based nonvolatile latch circuit,” Nanotechnology,
vol. 21, no. 23, Jun. 2010.
[12] International Technology Roadmap for Semiconductors, 2009 Report. [Online]. Available:
http://www.itrs.net/reports.html
[13] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective,
4th ed. USA: Addison-Wesley Publishing Company, 2010.
[14] W. Wolf, Modern VLSI Design, 3rd ed. Upper Saddle River, NJ, USA: Prentice Hall
PTR, 2002.
[15] C. G. Dimitrios Soudris, Christian Piguet, Ed., Designing CMOS Circuits for Low
Power. Kluwer Academic Publishers, 2002.
[16] L. Xiu, VLSI Circuit Design Methodology Demystified: A Conceptual Taxonomy.
Wiley-Interscience, 2008.
[17] Y. Tsividis, Operation and Modeling of the MOS Transistor. Oxford University
Press, Inc., 1999.
[18] J. Srinivasan, S. V. Adve, P. Bose, and J. A. Rivers, “The impact of technology scal-
ing on lifetime reliability,” in Proc. Int. Conference on Dependable Systems and Net-
works, Jun. 2004, pp. 1–10.
[19] J. Long, S. O. Memik, G. Memik, and R. Mukherjee, “Thermal monitoring mecha-
nisms for chip multiprocessors,” ACM Trans. Archit. Code Optim., vol. 5, no. 2, pp.
1–33, 2008.
[20] E. Rotem, J. Hermerding, C. Aviad, and C. Harel, “Temperature measurement in the
Intel Core Duo processor,” in Proc. of 12th Int. Workshop on Thermal investigation of
ICs, Sep. 2006, pp. 1–5.
[21] A. H. Ajami, K. Banerjee, M. Pedram, and L. P. P. P. van Ginneken, “Analysis of non-
uniform temperature-dependent interconnect performance in high performance ics,”
in DAC ’01: Proceedings of the 38th annual Design Automation Conference. New
York, NY, USA: ACM, 2001, pp. 567–572.
[22] P. Bratek and A. Kos, “Temperature sensors placement strategy for fault diagnosis in
integrated circuits,” pp. 245–251, 2001.
[23] J. Altet, A. Rubio, A. Salhi, J. Galvez, S. Dilhaire, A. Syal, and A. Ivanov,
“Sensing temperature in CMOS circuits for thermal testing,” 22nd IEEE VLSI
Test Symposium, 2004. Proceedings., pp. 179–184, 2004. [Online]. Available:
http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1299241
[24] J. L. Ayala, A. Sridhar, and D. Cuesta, “Thermal modeling and analysis of 3d
multi-processor chips,” Integration, the VLSI Journal, vol. 43, no. 4, pp. 327 – 341,
2010. [Online]. Available: http://www.sciencedirect.com/science/article/B6V1M-
50BJNF6-1/2/d078a5771e96e3c4ac9f90cbbe3c304a
[25] D. Brooks and M. Martonosi, “Dynamic thermal management for high-performance
microprocessors,” in High-Performance Computer Architecture, 2001. HPCA. The
Seventh International Symposium on, 2001, pp. 171 –182.
[26] E. Rotem, A. Naveh, M. Moffie, and A. Mendelson, “Analysis of thermal monitor
features of the Intel Pentium M processor,” in Workshop on Temperature-aware
Computer Systems, 2004.
[27] D. Blackburn, “Temperature measurements of semiconductor devices - a review,” in
Semiconductor Thermal Measurement and Management Symposium, 2004. Twentieth
Annual IEEE, Mar. 2004, pp. 70–80.
[28] R. Mukherjee and S. O. Memik, “Systematic temperature sensor allocation and place-
ment for microprocessors,” in DAC ’06: Proceedings of the 43rd annual Design Au-
tomation Conference. New York, NY, USA: ACM, 2006, pp. 542–547.
[29] R. Cochran and S. Reda, “Spectral techniques for high-resolution thermal character-
ization with limited sensor data,” in Design Automation Conference, 2009. DAC ’09.
46th ACM/IEEE, Jul. 2009, pp. 478 –483.
[30] T. Burd, T. Pering, A. Stratakos, and R. Brodersen, “A dynamic voltage scaled mi-
croprocessor system,” in Solid-State Circuits Conference, 2000. Digest of Technical
Papers. ISSCC. 2000 IEEE International, 2000, pp. 294–295, 466.
[31] H. Hanson, S. Keckler, S. Ghiasi, K. Rajamani, F. Rawson, and J. Rubio, “Thermal
response to DVFS: analysis with an Intel Pentium M,” in Low Power Electronics
and Design (ISLPED), 2007 ACM/IEEE International Symposium on, Aug. 2007, pp.
219–224.
[32] J. Choi, C.-Y. Cher, H. Franke, H. Hamann, A. Weger, and P. Bose, “Thermal-aware
task scheduling at the system software level,” in Proceedings of the 2007
International Symposium on Low Power Electronics and Design, ser. ISLPED
’07. New York, NY, USA: ACM, 2007, pp. 213–218. [Online]. Available:
http://doi.acm.org/10.1145/1283780.1283826
[33] J. Chen, M. A. Reed, A. M. Rawlett, and J. M. Tour, “Large on-off ratios and negative
differential resistance in a molecular electronic device,” Science, vol. 286, no. 5444,
pp. 1550–1552, 1999.
[34] C. P. Collier, E. W. Wong, M. Belohradsky, F. M. Raymo, J. F. Stoddart, P. J. Kuekes,
R. S. Williams, and J. R. Heath, “Electronically configurable molecular-based
logic gates,” Science, vol. 285, no. 5426, pp. 391–394, 1999. [Online]. Available:
http://www.sciencemag.org/cgi/content/abstract/285/5426/391
[35] J. Ellenbogen and J. Love, “Architectures for molecular electronic computers. i. logic
structures and an adder designed from molecular electronic diodes,” Proc. IEEE,
vol. 88, no. 3, pp. 386 –426, Mar. 2000.
[36] C. Majumder, T. Briere, H. Mizuseki, and Y. Kawazoe, “Molecular resistance in a
molecular diode. a case study of the substituted phenylethynyl oligomer,” J. Phys.
Chem. A, vol. 106, no. 34, pp. 7911–7914, Jul. 2002.
[37] Y. Chen, D. A. A. Ohlberg, X. Li, D. R. Stewart, R. Stanley Williams, J. O. Jeppesen,
K. A. Nielsen, J. F. Stoddart, D. L. Olynick, and E. Anderson, “Nanoscale molecular-
switch devices fabricated by imprint lithography,” Appl. Phys. Lett., vol. 82, pp. 1610–
1612, Mar. 2003.
[38] M. Stan, P. Franzon, S. Goldstein, J. Lach, and M. Ziegler, “Molecular electron-
ics: from devices and interconnect to circuits and architecture,” Proc. IEEE, vol. 91,
no. 11, pp. 1940–1957, Nov. 2003.
[39] W.-Q. Deng, R. P. Muller, and W. A. Goddard, “Mechanism of the Stoddart-Heath
bistable rotaxane molecular switch,” J. Am. Chem. Soc., vol. 126,
no. 42, pp. 13 562–13 563, 2004, pMID: 15493882. [Online]. Available:
http://pubs.acs.org/doi/abs/10.1021/ja036498x
[40] P. J. Kuekes, D. R. Stewart, and R. S. Williams, “The crossbar latch: Logic value
storage, restoration, and inversion in crossbar circuits,” J. Appl. Phys., vol. 97, no.
034301, pp. 1–5, 2005.
[41] D. Nackashi, C. Amsinck, N. DiSpigna, and P. Franzon, “Molecular electronic latches
and memories,” in Nanotechnology, 2005. 5th IEEE Conference on, 11-15 2005, pp.
819–822 vol. 2.
[42] G. M. Morales, P. Jiang, S. Yuan, Y. Lee, A. Sanchez, W. You, and L. Yu, “Inversion
of the rectifying effect in diblock molecular diodes by protonation,” J. Am. Chem.
Soc., vol. 126, no. 30, pp. 10 456–10 457, 2005.
[43] M. J. Kumar, “Molecular diodes and applications,” Recent Patents on Nanotechnol-
ogy, vol. 1, no. 1, pp. 51–57, 2007.
[44] H. He, G. Mallick, R. Pandey, and S. Karna, “Mechanism of electrical rectification
in a unimolecular donor-bridge (π)-acceptor diode,” in Nanotechnology, 2007. IEEE-
NANO 2007. 7th IEEE Conference on, Aug. 2007, pp. 870–872.
[45] A. Aviram and M. A. Ratner, “Molecular rectifiers,” Chem. Phys. Lett., vol. 29, no. 2,
pp. 277–283, 1974.
[46] E. Linn, R. Rosezin, C. Kugeler, and R. Waser, “Complementary resistive switches for
passive nanocrossbar memories,” Nature Materials, no. 5, pp. 403–406, Apr. 2010.
[47] G. Snider, P. Kuekes, and R. S. Williams, “CMOS-like logic in defective, nanoscale
crossbars,” Nanotechnology, vol. 15, pp. 881–891, 2004.
[48] R. S. Chakraborty, S. Paul, and S. Bhunia, “Analysis and robust design of
diode-resistor based nanoscale crossbar PLA circuits,” in Proceedings of the
21st International Conference on VLSI Design, ser. VLSID ’08. Washington,
DC, USA: IEEE Computer Society, 2008, pp. 441–446. [Online]. Available:
http://dx.doi.org/10.1109/VLSI.2008.44
[49] M. Mishra and S. C. Goldstein, Defect tolerance at the end of the roadmap.
Norwell, MA, USA: Kluwer Academic Publishers, 2004, pp. 73–108. [Online].
Available: http://portal.acm.org/citation.cfm?id=1137939.1137947
[50] J. G. Brown and R. D. S. Blanton, “CAEN-BIST: Testing the nanofab-
ric,” in Proc. Int. Test Conf., ser. ITC ’04. Washington, DC,
USA: IEEE Computer Society, 2004, pp. 462–471. [Online]. Available:
http://portal.acm.org/citation.cfm?id=1116164.1116497
[51] M. Tehranipoor and R. Rad, “Built-in self-test and recovery procedures for molecular
electronics-based nanofabrics,” IEEE Trans. Computer-Aided Design of Integrated
Circuits and Systems, vol. 26, no. 5, pp. 943–958, May 2007.
[52] B. Zamanlooy and A. Ayatollahi, “Modified CAEN-BIST algorithm for better utiliza-
tion of nanofabrics,” in Electrical and Computer Engineering, 2008. ICECE 2008.
International Conference on, Dec. 2008, pp. 297–301.
[53] S. C. Goldstein and M. Budiu, “Nanofabrics: spatial computing using molecular
electronics,” SIGARCH Comput. Archit. News, vol. 29, pp. 178–191, May 2001.
[Online]. Available: http://doi.acm.org/10.1145/384285.379262
[54] A. DeHon and M. J. Wilson, “Nanowire-based sublithographic programmable logic
arrays,” in Proceedings of the 2004 ACM/SIGDA 12th international symposium on
field programmable gate arrays, ser. FPGA ’04. New York, NY, USA: ACM, 2004,
pp. 123–132. [Online]. Available: http://doi.acm.org/10.1145/968280.968299
[55] C. Teodorov, “Nanocad: Design automation methods for emerging nanoscale tech-
nologies. a survey of nanoscale computing architectures and associated cad tools.”
Master’s thesis, University of Bretagne Occidentale, 2008.
[56] M. Ziegler and M. Stan, “CMOS/nano co-design for crossbar-based molecular elec-
tronic systems,” IEEE Trans. Nanotechnol., vol. 2, no. 4, pp. 217–230, Dec. 2003.
[57] L. Chua, “Memristor-the missing circuit element,” IEEE Trans. Circuit Theory,
vol. 18, no. 5, pp. 507 – 519, Sep. 1971.
[58] L. Chua and S. M. Kang, “Memristive devices and systems,” Proc. IEEE, vol. 64,
no. 2, pp. 209–223, Feb. 1976.
[59] D. B. Strukov, G. S. Snider, D. R. Stewart, and S. R. Williams, “The missing memris-
tor found,” Nature, vol. 453, no. 7191, pp. 80–83, May 2008.
[60] Q. Xia, W. Robinett, M. W. Cumbie, N. Banerjee, T. J. Cardinali, J. J. Yang,
W. Wu, X. Li, W. M. Tong, D. B. Strukov, G. S. Snider, G. Medeiros-Ribeiro,
and R. S. Williams, “Memristor/CMOS hybrid integrated circuits for reconfigurable
logic,” Nano Lett., vol. 9, no. 10, pp. 3640–3645, 2009. [Online]. Available:
http://pubs.acs.org/doi/abs/10.1021/nl901874j
[61] M. Stork, J. Hrusak, and D. Mayer, “Memristor based feedback systems,” in Applied
Electronics, 2009. AE 2009, Sep. 2009, pp. 237 –240.
[62] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu,
“Nanoscale memristor device as synapse in neuromorphic systems,” Nano Letters,
vol. 10, no. 4, pp. 1297–1301, 2010, pMID: 20192230. [Online]. Available:
http://pubs.acs.org/doi/abs/10.1021/nl904092h
[63] F. Merrikh-Bayat and S. B. Shouraki, “Bottleneck of using single memristor as a
synapse and its solution,” eprint arXiv:1008.3450v2 [cs.NE], 2010.
[64] F. Argall, “Switching phenomena in titanium oxide thin films,” Solid-State Electron-
ics, vol. 11, pp. 535–541, 1968.
[65] J. Blanc and D. L. Staebler, “Electrocoloration in SrTiO3: Vacancy drift and
oxidation-reduction of transition metals,” Phys. Rev. B, vol. 4, no. 10, Nov. 1971.
[66] Y. Dong, G. Yu, M. C. McAlpine, W. Lu, and C. M. Lieber, “Si/a-
Si core/shell nanowires as nonvolatile crossbar switches,” Nano Lett.,
vol. 8, no. 2, pp. 386–391, 2008, pMID: 18220442. [Online]. Available:
http://pubs.acs.org/doi/abs/10.1021/nl073224p
[67] S. H. Jo, K.-H. Kim, and W. Lu, “High-density crossbar arrays based on a Si
memristive system,” Nano Lett., vol. 9, no. 2, pp. 870–874, 2009. [Online]. Available:
http://pubs.acs.org/doi/abs/10.1021/nl8037689
[68] J. J. Yang, M. D. Pickett, X. Li, D. A. A. Ohlberg, D. R. Stewart, and R. S. Williams,
“Memristive switching mechanism for metal/oxide/metal nanodevices,” Nature Nan-
otechnology, vol. 3, no. 7, 2008.
[69] M. D. Pickett, D. B. Strukov, J. L. Borghetti, J. J. Yang, G. S. Snider, D. R. Stewart,
and R. S. Williams, “Switching dynamics in titanium dioxide memristive devices,” J.
Appl. Phys., vol. 106, no. 7, pp. 074 508–074 508–6, Oct. 2009.
[70] Y. V. Pershin and M. Di Ventra, “Spin memristive systems: Spin memory effects in
semiconductor spintronics,” Phys. Rev. B, vol. 78, no. 11, p. 113309, Sep. 2008.
[71] F. Y. Wang, “Memristor for introductory physics,” eprint arXiv:0808.0286v1
[physics.class-ph], Aug. 2008.
[72] Y. N. Joglekar and S. J. Wolf, “The elusive memristor: properties of basic electrical
circuits,” Eur. J. Phys., vol. 30, no. 661, Jul. 2009.
[73] D. B. Strukov, J. L. Borghetti, and R. S. Williams, “Coupled ionic and electronic
transport model of thin-film semiconductor memristive behavior,” small, vol. 5, no. 9,
pp. 1058–1063, 2009.
[74] M. E. Glicksman, Diffusion in Solids: Field Theory, Solid-State Principles, and Ap-
plications. John Wiley & Sons, Inc., 2000.
[75] U. Weinert and E. A. Mason, “Generalized nernst-einstein relations for nonlinear
transport coefficients,” Phys. Rev. A, vol. 21, no. 2, 1980.
[76] D. Strukov and R. Williams, “Exponential ionic drift: fast switching and low
volatility of thin-film memristors,” Appl. Phys. A: Materials Science & Processing,
vol. 94, pp. 515–519, 2009, 10.1007/s00339-008-4975-3. [Online]. Available:
http://dx.doi.org/10.1007/s00339-008-4975-3
[77] S. Benderli and T. Wey, “On spice macromodelling of TiO2 memristors,” Elec. Lett.,
vol. 45, no. 7, pp. 377 –379, 26 2009.
[78] Z. Biolek, D. Biolek, and V. Biolkova, “SPICE model of memristor with nonlinear
dopant drift,” Radioengineering Journal, vol. 30, no. 4, pp. 210–214, 2010.
[79] S. Shin, K. Kim, and S.-M. Kang, “Compact models for memristors based on charge-
flux constitutive relationships,” Computer-Aided Design of Integrated Circuits and
Systems, IEEE Transactions on, vol. 29, no. 4, pp. 590–598, Apr. 2010.
[80] A. Rak and G. Cserey, “Macromodeling of the memristor in spice,” Computer-Aided
Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 29, no. 4, pp.
632 –636, Apr. 2010.
[81] R. Pino, J. Bohl, N. McDonald, B. Wysocki, P. Rozwood, K. Campbell, A. Oblea, and
A. Timilsina, “Compact method for modeling and simulation of memristor devices:
Ion conductor chalcogenide-based memristor devices,” Jun. 2010, pp. 1 –4.
[82] T. Chang, S.-H. Jo, K.-H. Kim, P. Sheridan, S. Gaba, and W. Lu, “Synaptic behaviors
and modeling of a metal oxide memristive device,” Appl. Phys. A, Feb. 2011.
[83] J. G. Simmons, “Conduction in thin dielectric films,” J. Appl. Phys., vol. 34, pp.
2581–2590, 1963.
[84] R. M. Hill, “Electrical conduction in ultra thin metal films. i. theoretical,” in Proceed-
ings of the Royal Society of London. Series A, Mathematical and Physical Sciences,
vol. 309, no. 1498, Mar. 1969, pp. 377–395.
[85] J. Borghetti, “Electrical transport and thermometry of electroformed titanium dioxide
memristive switches,” J. Appl. Phys., vol. 106, no. 12, pp. 124 504–124 504–5, 2009.
[86] A. Chung, J. Deen, J.-S. Lee, and M. Meyyappan, “Nanoscale memory
devices,” Nanotechnology, vol. 21, no. 41, p. 412001, 2010. [Online]. Available:
http://stacks.iop.org/0957-4484/21/i=41/a=412001
[87] A. A. Zakhidov, B. Jung, J. D. Slinker, H. D. Abruna, and G. G. Malliaras, “A light-
emitting memristor,” Organic Electronics, vol. 11, no. 1, pp. 150–153, 2010.
[88] W. Robinett, G. Snider, D. Stewart, J. Straznicky, and R. Williams, “Demultiplexers
for nanoelectronics constructed from nonlinear tunneling resistors,” IEEE Transac-
tions on Nanotechnology, vol. 6, no. 3, pp. 280–290, May 2007.
[89] P. O. Vontobel, W. Robinett, P. J. Kuekes, D. R. Stewart, J. Straznicky, and R. S.
Williams, “Writing to and reading from a nano-scale crossbar memory based on mem-
ristors,” Nanotechnology, vol. 20, no. 42, pp. 425 204–425 223, 2009.
[90] H. Kim, M. Sah, C. Yang, and L. Chua, “Memristor-based multilevel memory,” in
Cellular Nanoscale Networks and Their Applications (CNNA), 2010 12th Interna-
tional Workshop on, Feb. 2010, pp. 1–6.
[91] C. E. Merkel, N. Nagpal, S. Mandalapu, and D. Kudithipudi, “Reconfigurable n-level
memristor memory design,” in International Joint Conference on Neural Networks,
2011.
[92] S. Remarsu and S. Kundu, “On process variation tolerant low cost thermal sensor
design in 32nm cmos technology,” in Proceedings of the 19th ACM Great Lakes
symposium on VLSI, ser. GLSVLSI ’09. New York, NY, USA: ACM, 2009, pp.
487–492. [Online]. Available: http://doi.acm.org/10.1145/1531542.1531653
[93] D. Burger and T. M. Austin, “The simplescalar tool set, version 2.0,” SIGARCH
Comput. Archit. News, vol. 25, pp. 13–25, Jun. 1997. [Online]. Available:
http://doi.acm.org/10.1145/268806.268810
[94] S. Mukherjee, P. Bannon, S. Lang, A. Spink, and D. Webb, “The alpha 21364 network
architecture,” in Hot Interconnects 9, 2001., 2001, pp. 113 –117.
[95] D. Brooks, V. Tiwari, and M. Martonosi, “Wattch: a framework for architectural-level
power analysis and optimizations,” SIGARCH Comput. Archit. News, vol. 28, pp.
83–94, May 2000. [Online]. Available: http://doi.acm.org/10.1145/342001.339657
[96] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and
D. Tarjan, “Temperature-aware microarchitecture: Modeling and implementation,”
ACM Trans. Archit. Code Optim., vol. 1, pp. 94–125, Mar. 2004. [Online]. Available:
http://doi.acm.org/10.1145/980152.980157
[97] SPEC-CPU2000, “Standard Performance Evaluation Corporation, performance evaluation
in the new millennium, version 1.1,” 2000.
[98] K. Dellaquila, “Thermal profiling of homogeneous multi-core processors using sensor
mini-networks,” Master’s thesis, Rochester Institute of Technology, 2010.
