



A standard cell approach for MagnetoElastic NML circuits / D. Giri; M. Vacca; G. Causapruno; W. Rao; M. Graziano; M.
Zamboni. - PRINT. - (2014), pp. 65-70. (Paper presented at the conference Nanoscale Architectures (NANOARCH),
2014 IEEE/ACM International Symposium on, held in Paris, July 8-10, 2014.)
Availability:
This version is available at: 11583/2562939 since:
IEEE - INST ELECTRICAL ELECTRONICS ENGINEERS INC
A Standard Cell Approach for
MagnetoElastic NML Circuits
D. Giri∗†, M. Vacca∗, G. Causapruno∗, Wenjing Rao†, M. Graziano∗†, M. Zamboni∗
∗ Politecnico di Torino, Department of Electronics and Telecommunications, Corso Duca degli Abruzzi, 24, 10129 Torino, Italy
† University of Illinois at Chicago, Electrical and Computer Engineering Department, 851 S. Morgan St, Chicago (IL)
Email: {marco.vacca, giovanni.causapruno, mariagrazia.graziano, maurizio.zamboni}@polito.it, wenjing@uic.edu
Abstract—Among emerging technologies Quantum dot Cellu-
lar Automata (QCA) plays a fundamental role. Its magnetic ver-
sion, normally called NanoMagnet Logic (NML), is particularly
interesting thanks to its ability to work at room temperature and
to combine logic and memory in the same device. Magnetic circuits
also have potentially very low power consumption. Unfortunately,
classic NML circuits are normally driven (clocked) by a current
that generates a clocked magnetic field, which nullifies the
possibility of actually obtaining low power circuits.
We have recently developed a technology-friendly solution,
MagnetoElastic NML (ME-NML), in which magnetic circuits are
driven by an electric field rather than a current, drastically
reducing the power consumption. In this paper we start
to explore the architectural consequences of this new magnetic
technology. The analysis is performed using as a benchmark a
Galois multiplier, a systolic architecture particularly suited to
QCA and NML technologies. The layout is described in detail,
and the resulting circuit is modeled and simulated in VHDL.
The circuit area is reduced by about 4 times compared to the
classic NML approach. This, coupled with the intrinsically lower
power consumption of the different clock, leads to a reduction
of about 50 times in power consumption.
Moreover, the particular structure of magnetoelastic NML makes it
possible to define a library of standard cells that can be easily used by
designers and automatic layout tools to design circuits, greatly
facilitating future research in this field.
Index Terms—NanoMagnet Logic, Magnetoelastic Effect, Low
Power Circuits, Galois Field Multiplier
I. INTRODUCTION
Among emerging technologies, Quantum-dot Cellular Automata
(QCA) [1] has drawn a considerable amount of attention in
recent years. Its magnetic implementation, NanoMagnet Logic
(NML) [2], is particularly interesting because it
offers unique features unavailable in current CMOS technology.
The basic unit is a single-domain nanomagnet. Thanks
to its rectangular shape and to sizes smaller than 100 nm, only
two stable states are possible (Fig. 1.A), and they can be
used to represent logic values [2]. Since the basic cell is a
magnet, NML couples logic and memory in the same device
[3]. Moreover, it is one of the few emerging technologies that
is feasible with current technological processes [2] and works
at room temperature. These unique features make NML one of

























Fig. 1. (A) Single-domain nanomagnets are used to represent logic values.
(B) Circuits are divided into small areas called clock zones. One of several
clock signals is applied to each clock zone. Thanks to this mechanism, at every time
step the magnets of a clock zone switch according to the magnets of a neighboring clock
zone that are in a stable state. (C) Multiphase clock system; 4 clock signals
are used in this case.
The distinctive characteristic of NML (and QCA) technology
is the need for a clock mechanism to successfully
switch cells from one logic state to the other. Circuits are
created by placing magnets on a plane, as shown in Fig. 1.B.
In theory, information should propagate through the circuit
thanks to the magnetic interaction among neighboring magnets, but
this interaction alone is not sufficient. Magnets must be forced
into an unstable state by external means, such as a magnetic
field, which lowers the barrier between the two stable states [4].
When the clock field is removed, magnets are free to switch
according to the input element, thereby propagating the
information. Another limitation is that, due to thermal noise
[5], only a limited number of elements can be cascaded;
otherwise the error probability in information propagation
increases greatly. For this reason a multiphase clock system
is adopted. Circuits are divided into areas, called clock zones,
each including a limited number of magnets (Fig. 1.B). Different
clock signals are applied to neighboring clock zones. In [6] a 3-phase
clock system is adopted, while Fig. 1.C depicts a 4-phase system.
The signals are identical, just shifted by 90° with respect to one
another. Thanks to the multiphase clock system, the magnets of a clock zone switch
according to neighboring magnets that are in a stable (HOLD)
state. Magnets in the RESET state have no influence on signal
propagation.
The clock is one of the most important drawbacks of
NML technology. Aside from the magnetic field clock [2],
other mechanisms have been developed, such as an STT-current clock
[7]. Both solutions use a current and therefore lead
to high power consumption. We have recently developed
an innovative solution based on an electric field instead of a
magnetic field, the magnetoelastic clock [8]. This solution is
similar to the one presented in [9], but it is technology-friendly
and achieves very low power consumption even when
all power losses in the clock generation network are considered.
One of the positive side effects of our clock solution is that
it leads to a limited number of possible
basic structures, which define a set of Standard Cells.
This predefined set of cells can easily be used to design
circuits both with custom layout and with automated tools
[10], greatly enhancing the development of NML technology.
In this paper we propose a first analysis of the implications of
the magnetoelastic clock at the circuit layout level. The analysis
is performed using as a benchmark a Galois multiplier, a
systolic architecture particularly suited to NML (and QCA)
technology. The results presented here show that this
clock solution allows for much more compact layouts, greatly
reducing both circuit area and power consumption compared
to magnetic field based NML.
II. MAGNETOELASTIC CLOCK
If a magnetic field is used as the clock mechanism, it can be
generated by a current flowing through a wire placed under the
magnet plane. Fig. 2.A shows the clock generation
network. The current flowing through the wire generates a magnetic field
parallel to the short side of the magnets, forcing them into
the RESET state. A ferrite yoke surrounds the wire, providing
better confinement of the magnetic flux lines. This clock
solution gives circuits a peculiar structure, in which clock
zones are parallel stripes (Fig. 2.B). Every stripe
corresponds to the clock wire required to generate the magnetic
field. This clock zone layout has important consequences
for circuit architectures [11]. While this clock mechanism
has been demonstrated both theoretically and experimentally [2],
its main drawback is the high power losses due to the Joule














Fig. 2. (A) Magnetic field clock. The magnetic field is generated by a current
flowing through a wire placed under the magnet plane. (B) Example of circuit
layout based on the magnetic field clock. AND/OR gates are used as basic
logic gates. (C) Magnetoelastic clock. Magnets are forced into the RESET state
by applying a voltage to electrodes placed on both sides of the clock
zone. The corresponding electric field creates a mechanical deformation of the
piezoelectric substrate (PZT), which changes the state of the magnets. (D) Example
of a magnetoelastic NML circuit. Clock zones are obtained through mechanically
isolated islands.
To overcome this problem we proposed and studied [8]
[12] a clock mechanism based on an electric field instead of
a magnetic field. The general idea is depicted in Fig. 2.C.
Magnets are placed on a piezoelectric substrate. The material
chosen is PZT (lead zirconate titanate), one of the best
piezoelectric materials available. Two electrodes, placed on both
sides of the magnets, generate the electric field when
a voltage is applied to them. The electric field induces a
strain in the piezoelectric substrate, and the corresponding
mechanical deformation of the magnets induces a variation in their
magnetization thanks to the magnetoelastic effect. The
application of an electric field therefore forces the magnets
into the RESET state. Moreover, since a voltage is used instead
of a current, clock losses are orders of magnitude smaller. The
structure is equivalent to a capacitor, and the only losses are
due to charging and discharging it. Losses can
be evaluated as $CV^2$, where the capacitance $C$ is lower
than 1 fF and the voltage $V$ is a few hundred
millivolts (further details are omitted for space reasons
but can be found in [8]). The energy losses are therefore very
small. An example of circuit layout is shown in Fig. 2.D. Every
clock zone is a mechanically isolated cell. The size of a clock zone
can vary between 3 and 5 magnets, depending on how
strict the requirements of the lithographic process are, because
mechanical isolation is obtained by patterning
the PZT with lithography. Communication among the magnets
of different clock zones takes place through the top and bottom borders,
since electrodes are placed on both sides of the zone. Logic
circuits are based on AND/OR gates as shown in [13].
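As a rough numerical illustration of the capacitor-loss estimate given above, the following sketch evaluates C·V² for one clock zone. The 1 fF capacitance is the upper bound quoted in the text; the 0.3 V voltage and the 100 MHz switching frequency are illustrative assumptions, not values taken from [8], and the function names are hypothetical.

```python
def clock_energy_per_cycle(capacitance_f: float, voltage_v: float) -> float:
    """Energy (J) lost charging and discharging one clock-zone capacitor: C * V^2."""
    return capacitance_f * voltage_v ** 2


def clock_power(capacitance_f: float, voltage_v: float, frequency_hz: float) -> float:
    """Average clock power (W) of one zone switched once per clock cycle."""
    return clock_energy_per_cycle(capacitance_f, voltage_v) * frequency_hz


C_MAX = 1e-15        # F, upper bound stated in the text
V_ASSUMED = 0.3      # V, assumed value within "a few hundred millivolts"
F_ASSUMED = 100e6    # Hz, assumed clock frequency

energy = clock_energy_per_cycle(C_MAX, V_ASSUMED)
power = clock_power(C_MAX, V_ASSUMED, F_ASSUMED)
print(f"{energy:.2e} J per cycle, {power:.2e} W per clock zone")
```

Under these assumptions the loss per zone is in the nanowatt range, which is consistent with the qualitative claim that the energy losses are very small.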
III. STANDARD CELLS LIBRARY
The layout of MagnetoElastic NML (ME-NML) circuits is
based on mechanically isolated islands of limited size, each
one corresponding to a clock zone. This layout was chosen
according to the limitations of the fabrication process [8], but it has
an interesting consequence: the number of possible magnet
patterns inside a clock zone is reasonably small. Thanks to
this characteristic it is possible to define a library of Magnetic
Standard Cells, each one corresponding to a particular magnet
configuration inside a clock zone. Once a finite set
of all the conceivable magnet patterns within a clock zone has been defined,
any kind of circuit can be easily designed. The standard cell
library is shown in Fig. 3. This approach is also particularly
interesting in view of a future ad hoc simulation and
synthesis tool for this technology [10].
A. Standard Cells
Cell height and width can vary between three and five
nanomagnets. They must be chosen
according to the logic requirements and the limitations of the
fabrication process. A 3 × 3 layout is the most efficient because
it has the shortest critical path, i.e. the lowest number of
cascaded magnets between an input and an output. A smaller
number of magnets in the critical path leads to a higher
clock frequency and a lower error probability during signal
propagation. With 3 × 3 cells, the electrode width is
40 nm, a value compatible with the minimum width of metal-1
wires in current CMOS technology [14]. The cell width and
height can be increased to five to simplify the fabrication
process (larger cells and electrodes are easier to fabricate), but
at the cost of decreasing the achievable clock frequency and










Fig. 3. Standard cell library elements with size of 3×3 magnets.
Fig. 3 shows the layout of all possible cell types included in
the standard cell library (only the 3×3 case is reported for space
reasons). Each table row corresponds to a different type of cell.
Cells are classified by type, and for each type different orientations
are possible: all cells of a specific type can
be obtained by a horizontal and/or vertical flip of the base
cell. A cell can represent either a logic gate or a simple wire.
In the field of nanomagnetic logic, the word wire stands for
a series of horizontally or vertically adjacent magnets. Wires
can be single or double. “Double” means that two signals are
routed in parallel through the cell. Single wires can have
different lengths, depending on whether they connect an input and an
output on the same cell side or inputs and
outputs at opposite cell corners. Double wires always have
the same length.
A crosswire cell [2] is used when two wires must cross
each other without interference (at the time of writing, NML
is still a planar technology). The library we created uses AND,
OR and INVERTER as its logic gate set. The inverter is simply
implemented by an even number of horizontally aligned nanomagnets,
because an odd number results in no signal inversion.
AND/OR logic gates are obtained by cutting one corner of a
magnet [13]. The different shape of these magnets gives them
a preferential state, which they leave only when both
inputs, from above and below, are both up (or both down), thereby
implementing an AND/OR logic function.
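The inverter rule above can be read as a parity argument: each magnet in a horizontal chain settles opposite to its neighbor, so the output depends on whether the chain length is even or odd. The sketch below illustrates this under the assumption that the first magnet of the chain holds the input value; the function name and the 0/1 encoding are hypothetical.

```python
def horizontal_chain_output(input_bit: int, n_magnets: int) -> int:
    """Logic value held by the last magnet of a horizontal chain.

    Assumes the first magnet holds input_bit and every following magnet
    settles opposite to its left neighbor, so a chain of n_magnets
    magnets performs n_magnets - 1 inversions.
    """
    if n_magnets < 1:
        raise ValueError("a chain needs at least one magnet")
    inversions = n_magnets - 1
    return input_bit ^ (inversions % 2)


# An even number of magnets inverts the signal, an odd number does not.
for n in (2, 3, 4, 5):
    print(n, horizontal_chain_output(1, n))
```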
B. VHDL Model
To simulate circuits we developed an RTL (Register Transfer
Level) model, written in VHDL, for each standard
cell [15]. The model helps to manage complexity and
hierarchy. The multiphase clock system gives NML (and
QCA) circuits a peculiar behavior. In particular, the propagation
of a signal through a clock zone is equivalent to
the behavior of a D-latch. As can be seen from Fig. 1,
every clock zone samples new data at every clock cycle. As a
consequence, every standard cell can be modeled by a register,
which emulates signal propagation, and an ideal logic gate without
delay, which represents the logic function. This is true for
AND, OR and inverter cells [15]. Wires are modeled simply by
a D-latch. The propagation delay of an NML wire therefore
depends on the number of clock zones the wire passes through.
In other words, this kind of wire can be considered equivalent
to a pipelined interconnect in standard CMOS. We chose a
4-phase clock system for our design (as explained
in Section IV), so a wire routed through N clock zones will



































Fig. 4. Hierarchical model for estimating the number of magnets and clock zones,
the occupied area and the power dissipation. (A) Outline of inputs and outputs for
a single standard cell. (B) The area and power values of every clock zone are
added up by each Processing Element and finally by the multiplier entity,
obtaining the final results.
Each standard cell is represented by its corresponding
RTL model. Every type of standard cell is identified by
a single VHDL description. Various parameters are used to
differentiate the cells of the same type. The parameters used are
highlighted in Fig. 4: cell length and width, cell orientation,
clock phase and cell position in the circuit layout. The model
includes a hierarchical bottom-up evaluation of the occupied
area and power dissipation, described in Fig. 4 [15]. The
actual computation of area and power is first performed
at the lowest layer: each standard cell computes its own
area and power consumption. Then every processing element
(PE) of the Galois multiplier, the test circuit described in this
paper (see Section IV), computes its total area and power
consumption. A PE is at a higher hierarchical level than a
standard cell, so it simply computes its total area and power
consumption as the sum of the area and power of all its
standard cells. The total area and power of the whole Galois
multiplier are then computed in the same way, as the sum of
the total area and power of each PE.
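The bottom-up accumulation described above can be sketched as three nested levels that each sum the figures of the level below. This is only an illustration in Python of the hierarchy, not the VHDL model of [15]; the class names and the per-cell numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandardCell:
    area_um2: float   # area of one clock zone (hypothetical value)
    power_uw: float   # power of one clock zone (hypothetical value)


@dataclass
class ProcessingElement:
    cells: List[StandardCell]

    @property
    def area_um2(self) -> float:
        return sum(c.area_um2 for c in self.cells)

    @property
    def power_uw(self) -> float:
        return sum(c.power_uw for c in self.cells)


@dataclass
class Multiplier:
    pes: List[ProcessingElement]

    @property
    def area_um2(self) -> float:
        return sum(pe.area_um2 for pe in self.pes)

    @property
    def power_uw(self) -> float:
        return sum(pe.power_uw for pe in self.pes)


# Hypothetical example: 4 PEs, each made of 10 identical cells.
cell = StandardCell(area_um2=0.06, power_uw=0.01)
pe = ProcessingElement(cells=[cell] * 10)
gfm = Multiplier(pes=[pe] * 4)
print(f"area = {gfm.area_um2:.2f} um^2, power = {gfm.power_uw:.2f} uW")
```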
Every standard cell evaluates its total number of magnets
starting from the height and length (in terms of magnets)
received as input parameters. The occupied area is calculated
by multiplying the physical cell length and width, taking into account
the separation space among magnets and the area
occupied by the electrodes. In ME-NML circuits, magnets
are 50×65 nm² and the separation space considered is 20 nm.
Electrodes are 40 nm wide for 3 × 3 cells and 70 nm wide for
bigger cells.
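A possible reading of the area calculation above is sketched below. The magnet size, the 20 nm spacing and the electrode widths come from the text; the exact way they combine is an assumption made for illustration (magnets 50 nm wide and 65 nm tall, gaps only between adjacent magnets, one electrode on the left and one on the right, no extra margin between cells), and the function name is hypothetical.

```python
def cell_area_nm2(cols: int, rows: int,
                  magnet_w: float = 50.0, magnet_h: float = 65.0,
                  gap: float = 20.0) -> float:
    """Area (nm^2) of one standard cell under the assumed layout rule."""
    if not (3 <= cols <= 5 and 3 <= rows <= 5):
        raise ValueError("cell sides are between three and five magnets")
    # Electrode width: 40 nm for 3x3 cells, 70 nm for bigger cells.
    electrode_w = 40.0 if (cols == 3 and rows == 3) else 70.0
    width = cols * magnet_w + (cols - 1) * gap + 2 * electrode_w
    height = rows * magnet_h + (rows - 1) * gap
    return width * height


print(cell_area_nm2(3, 3) / 1e6, "um^2 for a 3x3 cell")
print(cell_area_nm2(5, 5) / 1e6, "um^2 for a 5x5 cell")
```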
Power losses in NML circuits have two main components:
the power dissipated by nanomagnets during their switching
phase, and the power lost in the clock generation network. The
switching power consumption, required to force magnets into
the reset state, is equal to the height of the energy barrier
between the stable and reset states multiplied by the number of
magnets and the switching frequency. This is true because in
ME-NML, unlike magnetic NML, adiabatic switching is not
used, in order to achieve the maximum clock frequency. Indeed, adiabatic
switching reduces the switching power consumption
at the cost of a reduced clock frequency. The energy barrier
is around $180\,k_BT$. Every clock zone, together with its
two electrodes, behaves as a capacitor, so the clock consumption
of one cell corresponds to the energy needed to charge the
electrode capacitance ($CV^2$). The values of capacitance ($C$)
and voltage ($V$) are calculated from the cell sizes and
the materials selected.
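The switching component described above (barrier height times number of magnets times frequency) can be sketched as follows. The barrier of 180 k_BT comes from the text; the temperature of 300 K, the 100 MHz frequency and the count of 1000 magnets are assumptions chosen only to make the example concrete.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K


def switching_power(n_magnets: int, frequency_hz: float,
                    barrier_kbt: float = 180.0,
                    temperature_k: float = 300.0) -> float:
    """Switching power (W): energy barrier * number of magnets * frequency."""
    barrier_j = barrier_kbt * K_B * temperature_k
    return barrier_j * n_magnets * frequency_hz


# Hypothetical circuit: 1000 magnets clocked at 100 MHz at room temperature.
p_switch = switching_power(n_magnets=1000, frequency_hz=100e6)
print(f"{p_switch * 1e6:.4f} uW")
```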
IV. GALOIS FIELD MULTIPLIER
To verify and validate the proposed approach we used
a Galois Field Multiplier (GFM) as a case study. It is a highly
scalable and regular architecture that has many applications in
coding theory, computer algebra and cryptography. A Galois
Field GF($q$) is a field consisting of a finite number of elements
($q$ elements) together with two operations
(addition and multiplication) that can be performed on pairs of
elements. A Galois Field exists, and is unique, only for $q = p^m$,
where $p$ is a prime number and $m$ a positive integer.
Binary Galois Fields GF($2^m$) can be implemented very efficiently
with VLSI gates. GF($2^1$) is the smallest possible field:
it contains only the elements 0 and 1, and the two operations
are performed modulo 2. Addition is obtained with a logical
XOR, and multiplication with a logical AND. When the
value of $m$ in the binary field is greater than 1, ordinary
modulo operations do not apply. Each element of the field
can be uniquely represented by a polynomial of degree up
to $m-1$ with coefficients in GF(2). The following algorithm
illustrates how to multiply two polynomials $a(t)$ and $b(t)$,
belonging to GF($2^m$), modulo an irreducible polynomial $p(t)$
of degree $m$.
r(t) := 0
for i = m−1 downto 0 do
    r(t) := t*r(t) + a_i*b(t)
    if degree(r(t)) = m then r(t) := r(t) − p(t)
return r(t)
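The algorithm above can be sketched in software by storing each polynomial as an integer whose bit i is the coefficient of t^i; in GF(2) both addition and subtraction reduce to XOR. The function name is hypothetical, and the irreducible polynomial t^4 + t + 1 used in the example is an assumption, since the text does not state which polynomial the circuit uses.

```python
def gf2m_multiply(a: int, b: int, p: int, m: int) -> int:
    """Multiply a(t) and b(t) in GF(2^m) modulo the degree-m polynomial p(t).

    Polynomials are bit masks: bit i holds the coefficient of t^i.
    The coefficients of a(t) are consumed serially, most significant first.
    """
    r = 0
    for i in range(m - 1, -1, -1):
        r <<= 1                 # r(t) := t * r(t)
        if (a >> i) & 1:
            r ^= b              # ... + a_i * b(t)
        if (r >> m) & 1:
            r ^= p              # degree m reached: subtract p(t) (XOR)
    return r


P = 0b10011  # t^4 + t + 1, assumed irreducible polynomial for GF(2^4)
print(bin(gf2m_multiply(0b0010, 0b1000, P, 4)))  # t * t^3 reduced modulo p(t)
```

Processing one coefficient of a(t) per iteration mirrors the bit-serial organization, in which one operand is fed serially while the other is available in parallel.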
The circuit schematic of the Galois Field Multiplier, for the
case GF($2^4$), is shown in Fig. 5. Addition and multiplication
symbols, inscribed in circles, correspond respectively to XOR
and AND gates. The detail on the right side of Fig. 5 shows the
implementation of the combinational logic using only AND,
OR and INVERTER gates, which are the only ones available
in our NML standard cell library. The serial input (dataA)
and the feedback enabling the summation with the primitive
polynomial are critical paths: their length increases
proportionally to the field degree. Pipelining is employed
to break those paths, reducing the multiplication delay
at the price of an increase in circuit area. Using a pipelined















Fig. 5. Circuit schematic of a fully pipelined bit-serial Galois Field Multiplier.
On the right it is possible to observe the detail of the combinational logic of
one processing element.
path will be the same, but the serial bits of dataA must be
now fed one every two clock cycles.
As mentioned in Section III, NML circuits are intrinsically
pipelined: for every 4 consecutive clock zones crossed (assuming a
4-phase clock), a signal accumulates a propagation delay of 1 clock
cycle. It is therefore important to use regular architectures such as
systolic arrays to avoid long interconnections and maximize
performance. Systolic arrays are architectures composed of
identical processing elements with a highly regular layout. The
Galois Field Multiplier is one of those systolic architectures
and is therefore highly suitable for NML technology. In
Fig. 5 it is possible to identify each Processing Element (PE).
Apart from the first and the last, which are slightly different, all the
others are identical. Since this holds for any parallelism of
the multiplier, only three different processing elements need to
be designed. A Galois Multiplier with any number
of bits can therefore be designed simply by combining the first PE, the
desired number of central PEs and the last PE.
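The intrinsic pipelining described above can be illustrated with a toy model of a wire, in which clock zone k is driven by phase k mod 4 and, at each phase step, the zones of the active phase copy the value held by the previous zone. This is a simplified sketch under those assumptions, not the VHDL model of [15]; the function name is hypothetical.

```python
def phase_steps_to_cross(n_zones: int, phases: int = 4) -> int:
    """Number of phase steps a value needs to reach the last zone of a wire.

    Zone k switches only when phase (k mod phases) is active, copying the
    value held by zone k - 1; zone 0 copies the wire input.
    """
    if n_zones < 1:
        raise ValueError("a wire crosses at least one clock zone")
    zones = [None] * n_zones   # None means that no valid data has arrived yet
    step = 0
    while zones[-1] is None:
        for k in range(n_zones):
            if k % phases == step % phases:
                zones[k] = 1 if k == 0 else zones[k - 1]
        step += 1
    return step


for n in (4, 8, 12):
    steps = phase_steps_to_cross(n)
    print(f"{n} zones: {steps} phase steps = {steps / 4} clock cycles")
```

In this model a wire crossing 4 zones costs one full clock cycle, which is why long interconnections directly translate into latency.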
A. Magnetoelastic GFM
Fig. 6 shows the circuit layout of a 4-bit GFM implemented
with ME-NML technology. Different colors identify different
clock zones. We chose a 4-phase clock system because
it leads to a more regular layout than a 3-phase
clock. Electrodes inside the clock zones are not depicted
for the sake of clarity. In Fig. 6 signal paths
are highlighted with arrows. Every clock zone corresponds to
one of the standard cells in Fig. 3. The central processing
elements are identical, while the first and the last processing
elements are slightly different. The result is an extremely
compact and regular circuit layout. The GFM is also perfectly
scalable, because adding more bits means adding more central
processing elements, which are all equal.
Fig. 6. NML implementation of a 4-bit serial Galois Field Multiplier based on magnetoelastic NML circuits. Three types of processing element can be
identified; the first and the last are slightly different from the central PE. Clock zone electrodes are not depicted for image clarity.
Due to the intrinsic circuit pipelining, a new input can be
given to signal dataA every 6 clock cycles. As a consequence, a
multiplication can be completed in 6N clock cycles, where N
is the number of bits of the multiplier. To improve data throughput,
signal interleaving can be adopted [3]. Six multiplications
must be executed in parallel, so that at every clock cycle a new bit
from a different multiplication is fed to the circuit. In
this way the throughput is improved by 6 times. Moreover,
to reach the highest possible throughput, the PE input and the
feedback of the last PE have to be reset to zero whenever
the first bit of a new operation arrives. A reset signal (rst) was
therefore routed to the circuit and synchronized with the incoming
input signals.
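The timing figures above can be checked with a small calculation. The sketch assumes that interleaving exactly fills the idle cycles between two consecutive bits of the same multiplication, so that as many multiplications are in flight as there are cycles in the input interval; the function names are hypothetical.

```python
from fractions import Fraction


def cycles_per_multiplication(n_bits: int, input_interval: int = 6) -> int:
    """Clock cycles needed for one N-bit multiplication (6N for ME-NML)."""
    return input_interval * n_bits


def throughput(n_bits: int, input_interval: int = 6,
               interleaved: bool = False) -> Fraction:
    """Multiplications completed per clock cycle."""
    in_flight = input_interval if interleaved else 1
    return Fraction(in_flight, cycles_per_multiplication(n_bits, input_interval))


n = 4
print(cycles_per_multiplication(n), "cycles per multiplication")
print(throughput(n), "multiplications per cycle without interleaving")
print(throughput(n, interleaved=True), "multiplications per cycle with interleaving")
```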
B. Magnetic Clock GFM
For comparison with the layout obtained with the magnetoelastic
clock, we designed the Galois Field Multiplier using the
classic magnetic field clock. The layout of the 2-bit version is
depicted in Fig. 7. The 4-bit multiplier is not shown because
the schematic of the 2-bit version is easier to understand.
Since the particular structure of the GFM requires feedback
signals, a more complex structure is required than
the simple layout of Fig. 2.B. It is called snake clock
and is thoroughly described in [6]. Three clock phases are used, and
clock wires are alternately placed above and under the magnet
plane. Placing clock wires above and under the plane was later
suggested also in [2]. In NML, for a signal to propagate in a
particular direction, clock zones must be crossed in the right
order from 1 to 3. With the layout shown in Fig. 2.B signals
can move only from left to right. To enable feedback and
allow signal propagation also from right to left, phases 2 and
3 must be swapped. To permit this swap, the corresponding
clock wires must be twisted. The correct order of clock phases
that enables signal propagation in both directions is shown in
Fig. 7, where the area marked by an X corresponds to the
area where clock wires are twisted. Magnets cannot be placed
in that area. More details on the snake-clock scheme can be
found in [6].
Just like the implementation with the magnetoelastic clock,
the GFM can be assembled using only three different PEs.
To simulate and analyze this version of the GFM we
again implemented an RTL model described in
VHDL. Details on the model can be found in [15]. For the sake
of clarity we briefly report here how area and power are
evaluated in this model. The area is that of the rectangle
circumscribing the circuit. Power consumption is instead given
by two components: the power dissipated by nanomagnets during
their switching phase, and the power dissipated by the clock wires
due to the Joule effect. A value of $30\,k_BT$ is chosen as the
average energy dissipated by a single nanomagnet during
the switching phase, since an adiabatic clock is used in this
case [2]. The power consumption due to magnet switching
is simply obtained by multiplying this energy by the
number of magnets and the frequency. The main contribution
to the power consumption is, however, the dissipation due to
the Joule effect. A high current is necessary to generate
a magnetic field strong enough to force a reset. This power
component is simply evaluated as the power dissipated by a
3 mA current [2] flowing in a copper wire as long as all the clock
zones put together, with a cross-section as wide as a clock zone
and 400 nm high.
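The Joule component described above amounts to P = I²R with R = ρL/(w·h). The sketch below uses the 3 mA current and the 400 nm wire height from the text and a textbook room-temperature resistivity for copper; the wire length and width are hypothetical placeholders, since the actual clock zone dimensions are not reported here.

```python
RHO_COPPER = 1.68e-8  # ohm*m, textbook room-temperature resistivity of copper


def wire_resistance(length_m: float, width_m: float,
                    height_m: float = 400e-9,
                    resistivity: float = RHO_COPPER) -> float:
    """Resistance (ohm) of a rectangular wire: rho * L / (w * h)."""
    return resistivity * length_m / (width_m * height_m)


def joule_power(current_a: float, resistance_ohm: float) -> float:
    """Power (W) dissipated by Joule effect: I^2 * R."""
    return current_a ** 2 * resistance_ohm


# Hypothetical clock wire: 100 um total length, 400 nm wide.
r_wire = wire_resistance(length_m=100e-6, width_m=400e-9)
p_clock = joule_power(current_a=3e-3, resistance_ohm=r_wire)
print(f"R = {r_wire:.2f} ohm, P = {p_clock * 1e6:.2f} uW")
```

Because the dissipation scales with the total wire length, a layout with more or larger clock zones pays a proportionally higher clock power.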
As will be clear from the results presented in Section V, the
area of this version of the GFM is larger than the area of the
one implemented with the magnetoelastic clock. This also increases
the circuit latency, so a new data bit must be fed to the circuit
every 10 clock cycles instead of 6. Similarly, 10 multiplications
must be interleaved instead of 6 to reach maximum throughput.
V. RESULTS
The performance of the two implementations, in terms of
throughput, area and power, is now compared side by side. Area
and power consumption are summarized in Table I for a
number of bits varying from 4 to 32.
As discussed in Section IV, the latency, i.e. the number
of clock cycles between one input and the next, is 6 for the
magnetoelastic clock and 10 for the snake clock. As
a consequence, assuming the same clock frequency of 100 MHz
in both cases, the throughput of the magnetoelastic
clock is around 67% higher (a ratio of 10/6). Using data interleaving,
the throughput is maximized and is equal in both cases,
but with the magnetoelastic clock only 6 operations instead of
10 must be executed in parallel.
Fig. 7. NML implementation of a 2-bit serial Galois Field Multiplier based on a magnetic field and the snake-clock mechanism. The 2 bits version was
chosen instead of the 4 bit GFM for sake of picture clarity.
TABLE I
AREA AND POWER COMPARISON BETWEEN THE TWO GFM
IMPLEMENTATIONS WITH A VARIABLE NUMBER OF BITS.
Quantity                        Clock            4 bits   8 bits   16 bits  32 bits
Area (µm²)                      Magnetoelastic   14.07    28.63    57.76    116.03
                                Snake            56.60    107.29   208.67   411.42
Power, magnets switching (µW)   Magnetoelastic   0.072    0.148    0.299    0.602
                                Snake            0.023    0.046    0.092    0.184
Power, clock generation (µW)    Magnetoelastic   1.196    2.435    4.913    9.868
                                Snake            69.65    132.02   256.76   506.24
Power, total (µW)               Magnetoelastic   1.27     2.58     5.21     10.47
                                Snake            69.67    132.06   256.85   506.42
The area of the magnetoelastic GFM is about four times
smaller than that of the snake-clock GFM (in Table I the ratio
ranges from about 4.0 at 4 bits to about 3.5 at 32 bits). The reasons are twofold.
Nanomagnets have different sizes: 50×65 nm² for the magnetoelastic
GFM, 60×90 nm² for the snake-clock implementation.
Moreover, the magnetoelastic layout is intrinsically more compact,
with almost no wasted space, while in the snake-clock
case many regions of the clock zones contain no magnets due
to clock constraints. Regarding power consumption, the gap
between the magnetoelastic and snake clocks is much wider. The
intrinsic power consumption due to magnet switching is higher
in the magnetoelastic case, but the biggest source of power
dissipation is the loss in the clock generation network.
As can be seen from Table I, clock losses are extremely
high in the snake-clock case and very small in the
magnetoelastic case. Combining the much smaller
clock losses with the reduced area, the total power consumption
becomes about 50 times lower with the magnetoelastic clock
(from about 55 times at 4 bits to about 48 times at 32 bits), which
is a remarkable result.
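The area and power ratios discussed above follow directly from Table I and can be recomputed with a few lines. The values below are copied from the table; the variable names are illustrative.

```python
bits = [4, 8, 16, 32]

# Values copied from Table I (area in um^2, total power in uW).
area_magnetoelastic = [14.07, 28.63, 57.76, 116.03]
area_snake = [56.60, 107.29, 208.67, 411.42]
power_magnetoelastic = [1.27, 2.58, 5.21, 10.47]
power_snake = [69.67, 132.06, 256.85, 506.42]

area_ratio = [s / m for s, m in zip(area_snake, area_magnetoelastic)]
power_ratio = [s / m for s, m in zip(power_snake, power_magnetoelastic)]

for n, ar, pr in zip(bits, area_ratio, power_ratio):
    print(f"{n:2d} bits: area ratio {ar:.2f}, total power ratio {pr:.1f}")
```

Both ratios decrease slightly as the number of bits grows, which is why the factors of 4 and 50 are best read as approximate figures.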
VI. CONCLUSIONS
We have presented a detailed analysis of NML circuits based
on the magnetoelastic clock. A set of standard cells, covering all
possible clock zone configurations, was developed and used
to create the complete layout of an N-bit Galois multiplier.
The circuit was modeled and then simulated using an RTL
model written in VHDL. A power analyzer was embedded
in the model to evaluate the circuit area
and power consumption. Results show that the magnetoelastic
clock reduces the circuit area by about 4 times and the total
power consumption by about 50 times.
As future work we will continue to investigate the layout
of circuits based on this clock solution, which greatly enhances
NML technology. We are also conducting a detailed material
analysis to further reduce power consumption.
REFERENCES
[1] C. Lent, P. Tougaw, W. Porod, and G. Bernstein, “Quantum cellular
automata,” Nanotechnology, vol. 4, pp. 49–57, 1993.
[2] M. Niemier et al., “Nanomagnet logic: progress toward system-level
integration,” J. Phys.: Condens. Matter, vol. 23, p. 34, Nov. 2011.
[3] M. Vacca, M. Jiang, J. Wang, F. Cairo, G. Causapruno, G. Urgese,
A. Biroli, and M. Zamboni, “NanoMagnet Logic: an Architectural Level
Overview ,” Anderson, N.G., Bhanja, S. (eds.), Field-Coupled Nanocom-
puting. LNCS. Springer, Heidelberg, vol. 8280, 2014 (forthcoming).
[4] M. Vacca, M. Graziano, and M. Zamboni, “Majority Voter Full Charac-
terization for Nanomagnet Logic Circuits,” IEEE T. on Nanotechnology,
vol. 11, no. 5, pp. 940–947, Sep. 2012.
[5] G. Csaba and W. Porod, “Behavior of Nanomagnet Logic in the Pres-
ence of Thermal Noise,” in International Workshop on Computational
Electronics. Pisa, Italy: IEEE, 2010, pp. 1–4.
[6] M. Graziano, M. Vacca, A. Chiolerio, and M. Zamboni, “A NCL-HDL
Snake-Clock Based Magnetic QCA Architecture,” IEEE Trans. on
Nanotechnology, vol. 10, no. 5, pp. 1141–1149, Sep. 2011.
[7] J. Das, S. Alam, and S. Bhanja, “Low Power Magnetic Quantum Cellular
Automata Realization Using Magnetic Multi-Layer Structures,” J. on
Emerging and Selected Topics in Circuits and Systems, vol. 1, no. 3, pp.
267–276, Sep. 2011.
[8] M. Vacca, M. Graziano, A. Chiolerio, A. Lamberti, M. Laurenti,
D. Balma, E. Enrico, F. Celegato, P. Tiberto, and M. Zamboni, “Electric
clock for NanoMagnet Logic Circuits,” Anderson, N.G., Bhanja, S.
(eds.), Field-Coupled Nanocomputing. LNCS. Springer, Heidelberg,
vol. 8280, 2014 (forthcoming).
[9] M. S. Fashami, J. Atulasimha, and S. Bandyopadhyay, “Magnetization
Dynamics, Throughput and Energy Dissipation in a Universal Multifer-
roic Nanomagnetic Logic Gate with Fan-in and Fan-out,” Nanotechnol-
ogy, vol. 23, no. 10, Feb. 2012.
[10] S. Frache, D. Chiabrando, M. Graziano, M. Graziano, L. Boarino, and
M. Zamboni, “Enabling Design and Simulation of Massive Parallel
Nanoarchitectures,” Journal of Parallel and Distributed Computing, vol.
In Press, Aug. 2013.
[11] M. Awais, M. Vacca, M. Graziano, and G. Masera, “Quantum dot
Cellular Automata Check Node Implementation for LDPC Decoders,”
IEEE Trans. on Nanotechnology, vol. 12, no. 3, pp. 368–377, 2013.
[12] M. Vacca, M. Graziano, L. D. Crescenzo, A. Chiolerio, A. Lamberti,
D. Balma, G. Canavese, F. Celegato, E. Enrico, P. Tiberto, L. Boarino,
and M. Zamboni, “Magnetoelastic clock system for nanomagnet logic,”
IEEE Trans. on Nanotechnology, In publishing 2014.
[13] M. Niemier, E. Varga, G. Bernstein, W. Porod, M. Alam, A. Dingler,
A. Orlov, and X. Hu, “Shape Engineering for Controlled Switching With
Nanomagnet Logic,” IEEE Trans. on Nanotechnology, vol. 11, no. 2, pp.
220–230, Mar. 2012.
[14] “International Technology Roadmap of Semiconductors,” 2012,
http://public.itrs.net.
[15] M. Vacca, M. Graziano, and M. Zamboni, “Nanomagnetic Logic Micro-
processor: Hierarchical Power Model,” IEEE Trans. on VLSI Systems,
vol. 21, no. 8, pp. 1410–1420, Aug. 2012.
