A Multi-Gbps Unrolled Hardware List Decoder for a Systematic Polar Code by Giard, Pascal et al.
arXiv:1702.00938v1 [cs.AR] 3 Feb 2017
A Multi-Gbps Unrolled Hardware List Decoder
for a Systematic Polar Code
Pascal Giard∗†, Alexios Balatsoukas-Stimming†, Thomas Christoph Müller†,
Andreas Burg†, Claude Thibeault‡, and Warren J. Gross∗
∗Department of Electrical and Computer Engineering, McGill University, Montréal, Québec, Canada.
Email: pascal.giard@mail.mcgill.ca, warren.gross@mcgill.ca
†Telecommunications Circuits Laboratory, École polytechnique fédérale de Lausanne, Lausanne, Switzerland.
Email: {pascal.giard,alexios.balatsoukas,christoph.mueller,andreas.burg}@epfl.ch
‡Department of Electrical Engineering, École de technologie supérieure, Montréal, Québec, Canada.
Email: claude.thibeault@etsmtl.ca
Abstract—Polar codes are a new class of block codes with
an explicit construction that provably achieve the capacity of
various communications channels, even with the low-complexity
successive-cancellation (SC) decoding algorithm. Yet, the more
complex successive-cancellation list (SCL) decoding algorithm
is gathering more attention lately as it significantly improves
the error-correction performance of short- to moderate-length
polar codes, especially when they are concatenated with a cyclic
redundancy check code. However, as SCL decoding explores
several decoding paths, existing hardware implementations tend
to be significantly slower than SC-based decoders. In this paper,
we show how the unrolling technique, which has already been
used in the context of SC decoding, can be adapted to SCL
decoding yielding a multi-Gbps SCL-based polar decoder with an
error-correction performance that is competitive when compared
to an LDPC code of similar length and rate. Post-place-and-
route ASIC results for 28 nm CMOS are provided showing that
this decoder can sustain a throughput greater than 10 Gbps at
468 MHz with an energy efficiency of 7.25 pJ/bit.
I. Introduction
Polar codes were recently selected for the next-generation
mobile communications standard that is currently under de-
velopment by the 3GPP due to their excellent error-correction
performance at short to moderate blocklengths under SCL
decoding [1, p. 123]. Unfortunately, most SCL decoder im-
plementations in the literature still suffer from low throughput
and high decoding latency [2–5]. Several algorithmic and
architectural improvements have been proposed in order to
remedy this situation. For example, multi-bit SCL decoding [3]
can significantly reduce the decoding latency of SCL decoding,
but at a very large cost in terms of the required hardware
resources. A similar approach, which groups multiple bits into
symbols and transforms the SCL decoder to a symbol-based
SCL decoder was presented in [6] and is shown to offer sim-
ilar decoding throughput improvements compared to standard
multi-bit SCL decoding, but with lower decoding complexity.
A different approach was taken in [7] where the proposed Fast-
SSC-List decoding algorithm employs specialized decoding
units for smaller sub-codes of the polar code in order to reduce
the decoding latency.
Unrolled decoders are known for their tremendous through-
put [8–11]. They offer at least one order of magnitude im-
provement in throughput with respect to standard decoders at
the cost of larger area requirements. While this unrolling tech-
nique has been applied to SC-based polar decoders before [9,
11], it has not yet been applied to a hardware successive-
cancellation list (SCL)-based decoder. Applying the technique
to the original SCL decoding algorithm [12] would result in
a hardware implementation with very high area complexity.
Thus, in this paper, we propose an unrolled hardware im-
plementation of the Fast-SSC-List decoding algorithm [7].
We show that, for a (512, 427) systematic polar code, the
throughput is an order of magnitude higher than the state of
the art, while the error-correction performance is better than
that of the (576, 480) low-density parity-check (LDPC) code
from the IEEE 802.16e standard [13].
Outline: The remainder of this paper starts with Section II
by providing the necessary background, consisting of a brief
review of polar codes and an introduction to SCL-based
decoding algorithms. Moreover, we present a comparison of
the error-correction performance of an SCL-decoded polar
code against that of an LDPC code from the IEEE 802.16e
standard. Section II also briefly reviews the Fast-SSC-List
decoding algorithm. Section III describes our adaptation of
the fully-unrolled and pipelined hardware architecture to SCL
decoding. Section IV discusses implementation details and
provides post-place-and-route (PAR) ASIC area, timing, and
power results for the 28 nm UTBB-FD-SOI CMOS technology
from ST Microelectronics. A comparison against the state-of-
the-art SCL-based decoder implementations from the literature
is also carried out in Section IV. Finally, Section V concludes
this paper.
II. Background
A. Polar Codes
In his original work, Arıkan used a linear transformation of
a vector of bits that can be shown to lead to a polarization
phenomenon, meaning that some of these bits experience almost
noiseless transmission channels while the remaining bits
experience almost completely noisy transmission channels.
Polar codes exploit this polarization phenomenon to achieve
the symmetric capacity of memoryless channels as the code
length goes to infinity. More specifically, to construct an (N,
k) polar code, the N − k least reliable bits (i.e., the bits that
experience the N − k worst transmission channels), called the
frozen bits, are set to zero and the remaining k bits are used
to carry actual information.
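As an illustration of the construction above, the following minimal Python sketch applies the polarizing transform $x = u F^{\otimes n}$ over GF(2), without bit reversal, to an (8, 4) code. The frozen set {0, 1, 2, 4} is an assumption chosen for illustration only, the names `polar_transform` and `encode` are hypothetical, and the sketch is non-systematic, unlike the systematic code used in this work.

```python
def polar_transform(u):
    """Return u * F^{(x)n} over GF(2), with F = [[1, 0], [1, 1]] (no bit reversal)."""
    x = list(u)
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    step = 1
    while step < n:
        # Each stage XORs the upper half of a block into its lower half.
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


def encode(info_bits, frozen, n):
    """Place info bits on the non-frozen positions, zeros elsewhere, then transform."""
    info_positions = [i for i in range(n) if i not in frozen]
    assert len(info_bits) == len(info_positions)
    u = [0] * n
    for pos, bit in zip(info_positions, info_bits):
        u[pos] = bit
    return polar_transform(u)


# Illustrative (8, 4) code: positions 0, 1, 2 and 4 are assumed frozen.
FROZEN = {0, 1, 2, 4}
codeword = encode([1, 0, 0, 0], FROZEN, 8)
```

Because the transform is its own inverse over GF(2), applying `polar_transform` twice returns the original vector, which is a convenient sanity check.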
Polar codes provably achieve capacity when decoded us-
ing the low-complexity successive-cancellation (SC) algo-
rithm [14]. However, with SC decoding, the error-correction
performance of polar codes at short to moderate length is in
general worse than the error-correction performance of other
modern channel codes. It was shown that decoding polar codes
using an SCL-based decoding algorithm significantly improves
the situation [12], especially when concatenating the polar
code with a cyclic redundancy check (CRC) [12, 15].
It was shown in [16] that polar codes can be encoded and
decoded systematically, leading to an improved bit-error rate
(BER) without affecting the frame-error rate (FER). In this
work, systematic polar codes are used.
B. Successive-Cancellation List Decoding
The SC decoding algorithm is a greedy algorithm: it uses
the channel output $y_0^{N-1}$ and the previous bit estimates
$\hat{u}_0^{i-1}$ to estimate the value of bit $\hat{u}_i$. Therefore, as soon as an error
occurs a frame will inevitably be in error as past decisions
are never revisited. SCL-based decoding algorithms for polar
codes are also greedy in the sense that they sequentially build
the most likely codewords. However, at each step, instead
of considering only the most likely bit value, both possible
bit values—0 and 1—are considered. Thus, as the decoding
proceeds a constrained list of up to L potential candidate
codewords is built, and a reliability metric is calculated for
each path along the way. At the very end, the most likely
codeword among the candidates in the list is selected. In the
case of the CRC-aided SCL (CA-SCL) decoding algorithm
the polar code is concatenated with a CRC and when decoding
ends the CRC is calculated for all L candidate codewords. The
most likely codeword with a calculated CRC that matches the
expected CRC is selected as the estimated codeword. If none
of the CRCs matches, the codeword with the best reliability
is selected.
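The fork, prune and final selection steps described above can be sketched as follows. This is a simplified illustration, not the decoder of this paper: it assumes that a lower path metric means a more likely path, it uses the common hardware-friendly penalty rule (add |LLR| when the chosen bit disagrees with the sign of the LLR), and it takes the per-path LLR of the current bit as an input instead of computing it with SC. The function names and the `crc_ok` callback are hypothetical.

```python
def fork_and_prune(paths, llrs, list_size):
    """Fork every path on both bit values and keep the list_size best.

    paths: list of (bits, path_metric); llrs: one LLR per path for the current bit.
    """
    candidates = []
    for (bits, pm), llr in zip(paths, llrs):
        hard = 0 if llr >= 0 else 1
        for bit in (0, 1):
            # Assumed rule: no penalty when the bit follows the LLR sign.
            penalty = 0.0 if bit == hard else abs(llr)
            candidates.append((bits + [bit], pm + penalty))
    candidates.sort(key=lambda c: c[1])  # lower metric = more likely (assumption)
    return candidates[:list_size]


def select_codeword(candidates, crc_ok=None):
    """Pick the most likely candidate passing the CRC, else the most likely one."""
    ranked = sorted(candidates, key=lambda c: c[1])
    if crc_ok is not None:
        for bits, _ in ranked:
            if crc_ok(bits):
                return bits
    return ranked[0][0]


survivors = fork_and_prune([([], 0.0)], [1.5], list_size=2)
survivors = fork_and_prune(survivors, [-0.5, 2.0], list_size=2)
best = select_codeword(survivors)
```

Passing a `crc_ok` predicate mimics CA-SCL selection; omitting it mimics plain SCL, where the candidate with the best metric is returned. Frozen bits, which do not fork, are left out of the sketch.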
Fig. 1 shows the error-correction performance of a
(512, 427) systematic polar code decoded using the SCL and
the CA-SCL algorithm with L = 2. An 8-bit CRC is used
for the CA-SCL decoding algorithm. The performance is
simulated for random codewords, using a binary phase-shift
keying (BPSK) modulation over an additive white Gaussian
noise (AWGN) channel. Both FER and BER of the (576, 480)
LDPC code from the IEEE 802.16e standard [13] are included
for comparison. The LDPC code is decoded with a layered
schedule using the self-corrected min-sum algorithm [17]. It
can be seen that the chosen polar code compares favorably
against an LDPC code of similar length and rate even with
a list size as small as L = 2. It should also be noted that,
in this particular scenario, for a targeted FER of $10^{-3}$, it is
beneficial to not concatenate the polar code with a CRC. This
observation was also made in [2]: in some cases, it is more
beneficial, in terms of error rate, to not concatenate a polar
code with a CRC.
[Figure 1: two panels showing FER and BER versus Eb/N0 (dB), from 3 dB to 5 dB. Curves: List with L = 2; List-CRC with L = 2; LDPC with I = 5, I = 10, and I = 15.]
Fig. 1: Error-correction performance of SCL and CA-SCL
decoding of a (512, 427) systematic polar code versus that
of the (576, 480) IEEE 802.16e LDPC code. L is the maximum
number of candidate codewords for the SCL decoding
algorithm and I is the maximum number of iterations for the
self-corrected min-sum algorithm.
Unfortunately, SCL decoding involves the exploration of
multiple decoding paths simultaneously as well as a costly
path metric sorting step. Thus, hardware implementations of
SCL decoding are typically much slower than state-of-the-
art SC-based decoders. As mentioned in the introduction,
multiple algorithms employing multi-bit decisions have been
proposed [3–7] to significantly reduce the decoding latency
and increase the decoding throughput of SCL decoding. The
hardware implementation presented in this paper is based on
the Fast-SSC-List decoding algorithm proposed in [7].
C. Fast-SSC-List Decoding
SCL decoding uses the SC decoding algorithm, a sequential
algorithm proceeding bit by bit, which effectively limits the
achievable speed of hardware implementations. Recognizing
that a polar code is the concatenation of smaller constituent
codes, it was shown in [18] that many constituent codes
could be more efficiently estimated with dedicated decoders
compared to using the processing elements implementing the
SC algorithm. This led to the Fast-SSC decoding algorithm,
where multiple bits are estimated simultaneously. In [7], it was
proposed to adapt the Fast-SSC algorithm to SCL decoding.
Fast-SSC-List decoding provides algorithms for four different
constituent codes: Rate-1, Rate-0, Repetition, and
single-parity check (SPC). Each algorithm consists of two parts. The
first part is the candidate generation, which consists of creating the
L most likely bit estimate vectors β. The second part consists
of computing the corresponding path reliability metrics PM.
Note that for a Rate-0 constituent code, only one path reliability
metric is computed as there is only one possible candidate
estimated bit vector, the all-zero vector.
[Figure 2: block diagram of the unrolled decoder. Recoverable labels: channel LLRs αc; F, G0 and G1 blocks producing LLRs α1, α2 and α3; a Repetition (Rep) node; two SPC nodes (SPC 0 and SPC 1); path metric registers PM0 to PM4 and path registers ℓ0 to ℓ4; “&” joining blocks; L-Best Candidates sorting blocks; a Combine block; a Best Candidate block; output bit estimates βc.]
Fig. 2: Fully-unrolled deeply-pipelined Fast-SSC-List decoder for an (8, 4) polar code with L = 2. Constituent decoders for a
Repetition code and for SPC codes are shown in green and orange, respectively. Clock signals omitted for clarity.
While the proposed algorithms for the Rate-0 and Repetition
codes are exact, approximations are used for Rate-1 and SPC
codes. Thus, although it can be kept small, there is some
coding loss inherent to the Fast-SSC-List decoding algorithm
compared to the SCL algorithm.
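To make the two parts concrete for the Rate-0 and Repetition nodes, here is a minimal Python sketch of candidate generation and path metric computation. It assumes a penalty-style path metric in which |LLR| is added wherever a candidate bit disagrees with the sign of the corresponding LLR, and a lower metric is better; this convention and the function names are assumptions for illustration and are not taken from [7].

```python
def mismatch_penalty(llrs, bits):
    """Sum of |LLR| over positions where the bit disagrees with the LLR sign."""
    return sum(abs(a) for a, b in zip(llrs, bits) if (a < 0) != (b == 1))


def rate0_node(llrs, pm):
    """Rate-0: a single candidate, the all-zero vector."""
    bits = [0] * len(llrs)
    return [(bits, pm + mismatch_penalty(llrs, bits))]


def repetition_node(llrs, pm):
    """Repetition: two candidates, the all-zero and the all-one vectors."""
    n = len(llrs)
    return [([b] * n, pm + mismatch_penalty(llrs, [b] * n)) for b in (0, 1)]


alpha = [1.0, -2.0, 0.5, -0.25]
rate0_candidates = rate0_node(alpha, pm=0.0)
rep_candidates = repetition_node(alpha, pm=0.0)
```

With L paths entering a Repetition node, each path would produce two candidates in this way, and the L best of the resulting 2L candidates would be retained by the sorting step.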
III. Unrolled and Pipelined SCL Decoder Architecture
As we have already mentioned, unrolled decoder architec-
tures provide extremely high decoding speeds. In an unrolled
decoder architecture, each and every operation required is
instantiated in hardware so that data can flow through the
decoder with minimal control. An unrolled and pipelined
architecture for SC-based polar decoding was first described
in [9], and later improved and generalized in [11]. In this
section, we explain how the unrolled fast-SC decoder archi-
tecture of [11] can be extended to Fast-SSC-List decoding [7]
by means of a small example.
Fig. 2 shows an implementation example of a fully-unrolled
deeply-pipelined Fast-SSC-List-based decoder for an (8, 4)
polar code with L = 2. The F and G blocks, generating the
soft-input log-likelihood ratios (LLRs) to the constituent de-
coders, implement the same functions as in the SC algorithms
using the min-sum approximation [19]. The Combine block
corresponds to one stage of a polar encoder. The “&” blocks
are bit-vector joining operators, and registers are shown in
light gray, a Repetition node in green and SPC nodes in orange.
Each register denoted ℓ is used to store one of the paths
that survived, expressed as a partial sum. The ℓ-registers at
the output of a concatenation block “&” have the surviving
paths concatenated with the L new bit estimate vectors β
coming out of the constituent decoders. Thus, at the output
of a sorting step (denoted “L-Best Candidates” in Fig. 2), a
register ℓ contains the L paths that survived, also expressed
as a partial sum. A Combine block then updates the partial
sum by operating on the right hand side bits of an ℓ-register.
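The F, G and Combine operations mentioned above can be sketched in a few lines of Python. This is a behavioural illustration only, assuming floating-point LLRs, the min-sum form of F, and a transform without bit reversal in which the first half of the LLR vector is paired with the second half; the function names are hypothetical.

```python
def f_minsum(a, b):
    """Min-sum F: sign(a) * sign(b) * min(|a|, |b|)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * min(abs(a), abs(b))


def g(a, b, u):
    """G: b + a when the partial-sum bit u is 0, b - a when it is 1."""
    return b - a if u else b + a


def f_stage(alpha):
    """LLRs passed to the left constituent code."""
    half = len(alpha) // 2
    return [f_minsum(alpha[i], alpha[i + half]) for i in range(half)]


def g_stage(alpha, beta_left):
    """LLRs passed to the right constituent code, given the left bit estimates."""
    half = len(alpha) // 2
    return [g(alpha[i], alpha[i + half], beta_left[i]) for i in range(half)]


def combine(beta_left, beta_right):
    """One polar encoder stage: (left XOR right) followed by right."""
    return [l ^ r for l, r in zip(beta_left, beta_right)] + list(beta_right)


alpha = [1.0, -2.0, 0.5, -0.25, 3.0, 1.5, -1.0, 0.75]
left_llrs = f_stage(alpha)
right_llrs = g_stage(alpha, [1, 1, 1, 1])
beta = combine([1, 1, 1, 1], [0, 0, 1, 1])
```

In an unrolled decoder each of these calls corresponds to a dedicated hardware block, so the two possible G outputs (for an all-zero and an all-one Repetition result) can be computed side by side, as G0 and G1 in Fig. 2 suggest.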
In this example, the G functions are calculated preemptively
as a Repetition node allows for only 2 possible outcomes
and as the first path fork occurs there, the generated paths at
the output of the Repetition node will necessarily be retained
among the L = 2 best candidates. For the general case however,
this cannot be applied as the newly estimated paths (or partial
sums) may be combined with different path sources before
entering a G function.
Fig. 2 depicts a very small polar code and therefore does not
illustrate it, but as soon as LLRs α need to be retained
past a sorting step, multiplexers have to be inserted after each
sorting block. Those multiplexers allow the decoder to select
the LLR values corresponding to the surviving path sources.
IV. Implementation and Results
For this paper, we have implemented a fully-unrolled
partially-pipelined Fast-SSC-List decoder with an initiation
interval I = 20 and a list size L = 2. An initiation interval
I = 20 means that a new frame is fed into the decoder every
20 clock cycles. It also means that a new estimated codeword
is available at the output of the decoder every 20 clock cycles.
The Rate-0, Repetition, and SPC nodes were constrained to a
maximum length of 8, 8, and 4, respectively. The information
(i.e., rate-1) nodes were not constrained and as a result, the
largest one has a length of 128. The SPC node is pipelined
over 2 clock cycles to shorten its longest path.
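The relation between the initiation interval and the sustained throughput is simple arithmetic, sketched below. The 468 MHz clock frequency is the rounded value quoted in the abstract; the variable names are hypothetical.

```python
N, K = 512, 427              # code length and number of information bits
F_CLK_HZ = 468e6             # rounded clock frequency from the abstract
INITIATION_INTERVAL = 20     # clock cycles between two consecutive frames

# One frame enters (and one codeword leaves) every INITIATION_INTERVAL cycles.
frames_per_second = F_CLK_HZ / INITIATION_INTERVAL
coded_gbps = N * frames_per_second / 1e9
info_gbps = K * frames_per_second / 1e9
```

With the rounded frequency this gives roughly 12 Gbps of coded throughput and roughly 10 Gbps of information throughput. Halving the initiation interval would double both figures, at the cost of more pipeline registers.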
The critical path of the decoder is in the sorting block
(denoted “L-Best Candidates” in Fig. 2): it starts from the
output of a register storing a path metric, goes through 2 levels
of 7-bit comparators, and ends in another register storing one
of the 2 best path metrics. The clock frequency was selected to
obtain an information throughput of 10 Gbps.
A. Methodology
ASIC post-PAR results are for the 28 nm UTBB-FD-SOI
CMOS technology from ST Microelectronics using the RVT
standard-cell library. Synopsys Design Compiler and Cadence
Innovus were used for synthesis and PAR, respectively. Clock
gating was used in order to reduce power consumption. The
post-PAR netlist has been simulated for verification as well
as for the generation of vectors. These vectors have then
been used to annotate the design data with toggle rates that
correspond to steady-state operation (i.e., a filled pipeline)
in order to extract a meaningful estimation of the average
power consumption over 250 random frames. In our results, the
decoding latency includes the time required to load the channel
LLRs, decode a frame and output the best estimated codeword.
[Figure 3: two panels showing FER and BER versus Eb/N0 (dB), from 3 dB to 5 dB. Curves: floating point, and Qi.Qc.Qf = 7.6.1, 6.5.1, 6.5.0, and 5.4.0.]
Fig. 3: Effect of quantization on the error-correction performance
of Fast-SSC-List-based decoding of a (512, 427)
systematic polar code with list size L = 2.
B. Impact of Quantization
All LLRs and path metrics are expressed using the two’s
complement representation. The LLR value quantization is
denoted as Qi.Qc.Qf, where Qc is the total number of bits used
to store a channel LLR, Qi is the total number of bits used
to store internal LLRs and Qf is the number of fractional bits
in both types of LLRs. The number of bits used to represent
a path metric is Qi + 1, and path metrics are normalized after
each sorting step in order to avoid overflows.
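The Qi.Qc.Qf notation can be illustrated with a small Python sketch. Two points are assumptions made here rather than statements from the paper: values are saturated (clamped) at the limits of the two's complement range, and path metrics are normalized by subtracting the smallest metric. The function names are hypothetical.

```python
def quantize(value, total_bits, frac_bits):
    """Return the two's complement integer code of value, with saturation.

    Uses Python's round(), which rounds exact halves to the nearest even integer.
    """
    scaled = round(value * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def normalize_path_metrics(pms):
    """Assumed normalization: subtract the smallest metric from all metrics."""
    base = min(pms)
    return [pm - base for pm in pms]


Q_I, Q_C, Q_F = 6, 5, 0  # one of the configurations considered in this section
channel_codes = [quantize(v, Q_C, Q_F) for v in (20.3, -3.6, -40.0, 0.4)]
internal_code = quantize(1.3, Q_I, 1)  # with one fractional bit, for comparison
normalized = normalize_path_metrics([37, 41])
```

With Qc = 5 and Qf = 0 the channel LLR codes are confined to the range from -16 to 15, so large magnitudes such as 20.3 or -40.0 saturate. Subtracting the minimum keeps the metrics small without changing their order, which is all the sorting step depends on.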
Fig. 3 shows the effect of quantization on the
error-correction performance of List-based decoding of a (512, 427)
systematic polar code with list size L = 2. It can be seen that
the coding loss at a FER of $10^{-3}$ or a BER of $10^{-5}$ can be kept
under 0.25 dB with many different configurations. Notably,
with Qi.Qc.Qf = 7.6.1 and 6.5.0, the coding loss at a BER of
$10^{-5}$ is approximately 0.10 dB and 0.15 dB, respectively.
Thus, the proposed implementation uses Qi.Qc.Qf = 6.5.0 and
the path metrics are represented using 7 bits.
C. Comparison with the State of the Art
Table I shows the post-PAR results along with the power
consumption estimations for our Fast-SSC-List unrolled decoder
implementation. Unfortunately, we could not find implementation
results of SCL-based decoders for a frame length
N = 512 in the literature. Thus, the state-of-the-art works
included in Table I for comparison are decoders for the closest
frame length, i.e., N = 1024. Their list size, however, is
identical. It should also be noted that the other works only
present synthesis results and their post-PAR results would most
likely be slightly worse both in terms of area and in terms
of the operating frequency. Since our decoder is for a polar
code of higher rate than the other works, we list the coded
throughput for fair comparison. Similarly, we also present
technology-scaled results, using Dennard scaling laws [20],
for fair comparison.
TABLE I: Comparison with state-of-the-art List-based polar
decoders. Technology-scaled area results for 28 nm CMOS
are included at the bottom.
Implementation          this work⋄   [3]⋆     [4]⋆     [5]⋆
Code Length             512          1,024    1,024    1,024
Rate                    0.83         0.5      0.5      0.5
List Size               2            2        2        2
Algorithm               Approx.      Exact    Approx.  Approx.
Technology              28 nm        65 nm    90 nm    90 nm
Area (mm²)              0.87         1.06     1.98     2.32
Supply (V)              1.1          N/A      N/A      N/A
Frequency (MHz)         468          500      423      409
Latency (µs)            0.54         2.04     0.79     0.87
Coded T/P (Gbps)        12.0         0.5      1.3      2.4
Area Eff. (Gbps/mm²)    13.79        0.47     0.67     1.04
Power (mW)              87           395      N/A      N/A
Energy Eff. (pJ/bit)    7.25         790      N/A      N/A
Normalized results for 28 nm
Area (mm²)              0.87         0.20     0.19     0.22
Area Eff. (Gbps/mm²)    13.79        2.54     6.95     10.76
⋄Post-layout results, 80% utilization and timing is met.
⋆Synthesis results.
From Table I, it can be seen that the coded throughput of
the proposed decoder is from 5 to 21 times higher than that
of the other works. Latency is from 32% to 74% lower than
the other decoders. As the timing constraint was easily met,
the clock frequency could be increased to improve throughput
and latency at the cost of power consumption. The area of the
proposed decoder however is approximately 4 times higher
than the normalized area of the L = 2 List decoders of [3–5]
for N = 1024. Nonetheless, the post-PAR area efficiency of
our decoder is 1.3 to 5.4 times greater than the normalized
post-synthesis area-efficiency results of the other works. This
efficiency comes at the cost of reduced flexibility: the proposed
decoder only supports one specific polar code i.e., the code
length or its frozen bit locations cannot be modified at run
time. However, the multi-mode idea for unrolled decoders
described in [11] is also applicable to our proposed SCL-
based decoder and support for a few more polar codes could be
added. Lastly, the power consumption of our proposed decoder
is estimated at 87 mW, leading to an energy efficiency
of 7.25 pJ/bit.
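The derived figures in Table I can be reproduced with a few lines of Python. The sketch assumes that technology scaling is applied to area only, as the square of the feature-size ratio, with throughput left unscaled; this assumption is consistent with the normalized entries for [3], but the exact procedure is not spelled out in the text. The function names are hypothetical.

```python
def scale_area(area_mm2, from_nm, to_nm):
    """Assumed area scaling: proportional to the square of the feature size."""
    return area_mm2 * (to_nm / from_nm) ** 2


def area_efficiency(throughput_gbps, area_mm2):
    """Throughput per unit area, in Gbps/mm^2."""
    return throughput_gbps / area_mm2


def energy_per_bit_pj(power_mw, throughput_gbps):
    """mW divided by Gbps is numerically equal to pJ/bit."""
    return power_mw / throughput_gbps


# This work: 12.0 Gbps coded throughput, 0.87 mm^2, 87 mW (Table I).
this_work_area_eff = area_efficiency(12.0, 0.87)
this_work_energy = energy_per_bit_pj(87, 12.0)

# Decoder of [3]: 0.5 Gbps, 1.06 mm^2 in 65 nm, normalized to 28 nm.
ref3_area_28nm = scale_area(1.06, from_nm=65, to_nm=28)
ref3_area_eff_28nm = area_efficiency(0.5, ref3_area_28nm)
```

Rounded to two decimals, these expressions give 13.79 Gbps/mm² and 7.25 pJ/bit for this work, and 0.20 mm² and 2.54 Gbps/mm² for the normalized decoder of [3], matching the corresponding table entries.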
V. Conclusion
In this paper, we proposed a List-based decoder hardware
implementation for a systematic polar code with better error-
correction performance than an LDPC code of similar length
and rate from the 802.16e standard. Post-PAR ASIC results
for the 28 nm UTBB-FD-SOI CMOS technology from ST
Microelectronics demonstrated that the proposed decoder is
capable of sustaining a throughput greater than 10 Gbps with
an energy efficiency of 7.25 pJ/bit at a clock frequency of
468 MHz. These results show excellent energy efficiency at the
cost of increased area with respect to existing implementations.
Yet, the post-PAR area efficiency was shown to be 28% better
than the normalized post-synthesis area efficiency of the best
state-of-the-art SCL-based decoder in the literature. The key
ingredients to achieve these results were to adopt the Fast-
SSC-List decoding algorithm to reduce complexity, to adapt
the unrolling technique to List-based decoding to increase
speed, and to use clock gating to greatly reduce the power
consumption.
References
[1] 3GPP TSG RAN WG1 Meeting, #87 RAN1 Chairman’s
Notes, 2016.
[2] A. Balatsoukas-Stimming, M. B. Parizi, and A. Burg,
“LLR-based successive cancellation list decoding of
polar codes,” IEEE Trans. Signal Process., vol. 63,
no. 19, pp. 5165–5179, 2015, issn: 1053-587X. doi:
10.1109/TSP.2015.2439211.
[3] B. Yuan and K. K. Parhi, “Low-latency successive-
cancellation list decoders for polar codes with multi-
bit decision,” IEEE Trans. VLSI Syst., vol. 23,
no. 10, pp. 2268–2280, 2015, issn: 1063-8210. doi:
10.1109/TVLSI.2014.2359793.
[4] J. Lin, C. Xiong, and Z. Yan, “A high throughput list
decoder architecture for polar codes,” IEEE Trans. VLSI
Syst., vol. 24, no. 6, pp. 2378–2391, 2016, issn: 1063-
8210. doi: 10.1109/TVLSI.2015.2499777.
[5] C. Xiong, J. Lin, and Z. Yan, “A multimode area-
efficient SCL polar decoder,” IEEE Trans. VLSI Syst.,
vol. 24, no. 12, pp. 3499–3512, 2016, issn: 1063-8210.
doi: 10.1109/TVLSI.2016.2557806.
[6] C. Xiong, J. Lin, and Z. Yan, “Symbol-decision suc-
cessive cancellation list decoder for polar codes,” IEEE
Trans. Signal Process., vol. 64, no. 3, pp. 675–687, Feb.
2016.
[7] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and W. J.
Gross, “Fast list decoders for polar codes,” IEEE J. Sel.
Areas Commun., vol. 34, no. 2, pp. 318–328, 2016, issn:
0733-8716. doi: 10.1109/JSAC.2015.2504299.
[8] P. Schläfer, N. Wehn, M. Alles, and T. Lehnigk-Emden,
“A new dimension of parallelism in ultra high
throughput LDPC decoding,” in IEEE Workshop on
Signal Process. Syst. (SiPS), 2013, pp. 153–158. doi:
10.1109/SiPS.2013.6674497.
[9] P. Giard, G. Sarkis, C. Thibeault, and W. J. Gross, “237
Gbit/s unrolled hardware polar decoder,” IET Electron.
Lett., vol. 51, no. 10, pp. 762–763, 2015, issn: 0013-
5194. doi: 10.1049/el.2014.4432.
[10] A. Balatsoukas-Stimming, M. Meidlinger, R. Ghanaa-
tian, G. Matz, and A. Burg, “A fully-unrolled LDPC
decoder based on quantized message passing,” in IEEE
Workshop on Signal Process. Syst. (SiPS), 2015, pp. 1–
6. doi: 10.1109/SiPS.2015.7345024.
[11] P. Giard, G. Sarkis, C. Thibeault, and W. J. Gross,
“Multi-mode unrolled hardware architectures for po-
lar decoders,” IEEE Trans. Circuits Syst. I, vol. 63,
no. 9, pp. 1443–1453, 2016, issn: 1549-8328. doi:
10.1109/TCSI.2016.2586218.
[12] I. Tal and A. Vardy, “List decoding of polar codes,”
IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2213–2226,
2015, issn: 0018-9448. doi: 10.1109/TIT.2015.2410251.
[13] “IEEE standard for air interface for broadband wire-
less access systems,” IEEE Std 802.16-2012 (Revision
of IEEE Std 802.16-2009), pp. 1–2542, 2012. doi:
10.1109/IEEESTD.2012.6272299.
[14] E. Arıkan, “Channel polarization: a method for
constructing capacity-achieving codes for symmetric
binary-input memoryless channels,” IEEE Trans. Inf.
Theory, vol. 55, no. 7, pp. 3051–3073, 2009.
[15] K. Niu and K. Chen, “CRC-aided decoding of
polar codes,” IEEE Commun. Lett., vol. 16, no.
10, pp. 1668–1671, 2012, issn: 1089-7798. doi:
10.1109/LCOMM.2012.090312.121501.
[16] E. Arıkan, “Systematic polar coding,” IEEE Com-
mun. Lett., vol. 15, no. 8, pp. 860–862, 2011. doi:
10.1109/LCOMM.2011.061611.110862.
[17] V. Savin, “Self-corrected min-sum decoding of LDPC
codes,” in IEEE Int. Symp. on Inf. Theory (ISIT), 2008,
pp. 146–150. doi: 10.1109/ISIT.2008.4594965.
[18] G. Sarkis, P. Giard, A. Vardy, C. Thibeault, and
W. J. Gross, “Fast polar decoders: algorithm and
implementation,” IEEE J. Sel. Areas Commun., vol.
32, no. 5, pp. 946–957, 2014, issn: 0733-8716. doi:
10.1109/JSAC.2014.140514.
[19] C. Leroux, A. Raymond, G. Sarkis, and W. Gross,
“A semi-parallel successive-cancellation decoder for
polar codes,” IEEE Trans. Signal Process., vol. 61,
no. 2, pp. 289–299, 2013, issn: 1053-587X. doi:
10.1109/TSP.2012.2223693.
[20] R. H. Dennard, F. H. Gaensslen, V. L. Rideout, E.
Bassous, and A. R. LeBlanc, “Design of ion-implanted
MOSFET’s with very small physical dimensions,” IEEE
J. Solid-State Circuits, vol. 9, no. 5, pp. 256–268, Oct.
1974. doi: 10.1109/N-SSC.2007.4785543.
