A new bit-serial multiplier over GF(p^m) using irreducible trinomials, by Chang, Nam Su et al.
Computers and Mathematics with Applications 60 (2010) 355–361
A new bit-serial multiplier over GF(p^m) using irreducible trinomials
Nam Su Chang a,∗, Tae Hyun Kim d, Chang Han Kim b, Dong-Guk Han c, Jongin Lim a
a Center for Information and Security Technologies, Korea University, Seoul, Republic of Korea
b Department of Information and Security, Semyung University, Jechon, Republic of Korea
c Department of Mathematics, Kookmin University, Seoul, Republic of Korea
d The Attached Institute of ETRI, P.O.Box 1, Yuseong, Daejeon, 305-600, Republic of Korea
Article info
Keywords:
Finite field
Irreducible trinomial
Bit-serial multiplier
Pairing-based cryptography
Abstract
Pairing-based schemes, such as identity-based cryptosystems, are widely used in future computing
environments. Because most pairing-based schemes are implemented using arithmetic operations over GF(p^m)
defined by irreducible trinomials, hardware architectures for GF(p^m) have attracted considerable attention
over the past few years. This paper proposes a new most significant element (MSE)-first serial multiplier
for GF(p^m), where p > 2, which has a shorter time delay than the existing MSE-first multiplier and needs
smaller registers than least significant element (LSE)-first multipliers. The proposed multiplier is
particularly advantageous when the extension degree m is large and the characteristic p is small, as in the
fields GF(3^m), GF(5^m), and GF(7^m) used in pairing-based cryptosystems.
© 2010 Elsevier Ltd. All rights reserved.
1. Introduction
Finite field arithmetic is an integral part of many public key algorithms, including those based on the discrete
logarithm problem in finite fields, elliptic curve-based schemes, and pairing-based schemes. In particular,
(hyper-)elliptic curve cryptosystems ((H)ECC) and pairing-based cryptosystems (PBC) use prime fields and extension
fields. Identity-based schemes based on pairings are defined over finite fields of not only large characteristic but
also small characteristic such as 2, 3, 5, or 7. Until a few years ago, however, the focus was mainly on the fields
GF(p) and GF(2^m), for which both software implementations and hardware architectures have been studied extensively.
In contrast to GF(p) and GF(2^m), comparatively little work has been done on GF(p^m). Recently, to accelerate
practical applications of pairing-based schemes, there has been renewed interest in hardware implementations of
GF(3^m). For GF(3^m) arithmetic, the polynomial basis was used in [1–3] and the normal basis in [4]. In [3], a number
of ways of implementing characteristic-three arithmetic in hardware were examined. For example, the authors
represented an element of GF(3) by two bits and proposed logic for addition and multiplication in GF(3), which
requires 3 XOR and 4 OR gates and 4 AND and 2 OR gates, respectively. In [2], algorithms and architectures for new
GF(3^m) multiplier and inverter components were presented. In [4], the authors examined the use of normal basis
arithmetic in characteristic three in both hardware and software environments. Over the past few years, several
works have focused on hardware accelerators for pairing-based cryptosystems [5–9].
In [1], general architectures suitable for finite fields GF(p^m) with p odd were proposed, generalizing the work
of [10]. These multipliers trade performance for area by increasing the number of coefficients that the multiplier
processes at one time. In addition, [11] presented systolic digit-serial architectures for fields GF(p^m), which are
scalable in the sense that the multipliers support different fields GF(p^m) for which p is fixed and m is variable.
These features make the multiplier architectures suitable for ASIC as well as FPGA implementations. In [12],
multiplication algorithms over GF(p^m) and their implementation results on FPGA were compared.

This work was supported by the IT R&D program of MKE/KEIT [2009-F056-01, Development of Security Technology for Car-Healthcare].
∗ Corresponding author. Tel.: +82 2 3290 4762; fax: +82 2 928 9109.
E-mail address: ns-chang@korea.ac.kr (N.S. Chang).
0898-1221/$ – see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.camwa.2009.12.034

In many standard implementations, the irreducible polynomial is chosen to be a trinomial so that polynomial
reduction is efficient. For this reason, this paper proposes a new efficient MSE-first serial multiplier for GF(p^m)
generated by irreducible trinomials. The new multiplier has a smaller time delay than the MSE-first multiplier and a
smaller register than the LSE-first multiplier in [12]. In particular, the advantage of the proposed multiplier grows
with the extension degree of the finite field.
The remainder of this paper is organized as follows. Section 2 reviews previous multipliers for GF(p^m). In
Section 3, we propose a new MSE-first serial multiplier for GF(p^m) and construct a serial hardware architecture based
on the proposed algorithm. Section 4 compares the new MSE-first multiplier with the previous MSE-first and LSE-first
serial multipliers. Finally, Section 5 gives a brief conclusion.
2. Polynomial basis multiplication in GF(p^m)
Let $F(x) = x^m + f_t x^t + f_0$ be an irreducible trinomial of degree $m$ over GF(p), where $f_t, f_0 \in \mathrm{GF}(p)$.
The finite field GF(p^m) is isomorphic to $\mathrm{GF}(p)[x]/(F(x))$. Let $\alpha$ be a root of $F(x)$. The polynomial (or
standard) basis is then the set $\{1, \alpha, \alpha^2, \ldots, \alpha^{m-1}\}$, and an element $A(\alpha)$ of GF(p^m) is
represented in this basis as
$$A(\alpha) = a_{m-1}\alpha^{m-1} + \cdots + a_0, \quad a_i \in \mathrm{GF}(p). \tag{1}$$
For $A(\alpha), B(\alpha) \in \mathrm{GF}(p^m)$, the sum $C(\alpha) = A(\alpha) + B(\alpha)$ is computed as
$$C(\alpha) = A(\alpha) + B(\alpha) = \sum_{i=0}^{m-1} (a_i + b_i)\alpha^i,$$
where $a_i + b_i$ is an addition in GF(p).
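As a minimal sketch of this coefficient-wise addition, assume an element is stored as a Python list whose index i holds the coefficient $a_i$. The list layout and the name `gf_add` are illustrative choices, not part of the paper.

```python
def gf_add(a, b, p):
    """Add two GF(p^m) elements given as coefficient lists (index i holds the alpha^i coefficient)."""
    if len(a) != len(b):
        raise ValueError("both operands need m coefficients")
    # Each coefficient pair is added in GF(p); no reduction by F is needed
    # because addition never raises the degree.
    return [(ai + bi) % p for ai, bi in zip(a, b)]


# Example in GF(3^5): (1 + 2a + a^4) + (2 + 2a + a^3), coefficients reduced mod 3
print(gf_add([1, 2, 0, 0, 1], [2, 2, 0, 1, 0], 3))  # [0, 1, 0, 1, 1]
```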
A multiplication of $A(\alpha)$ and $B(\alpha)$ is computed as a multiplication of the polynomials $A(\alpha)$ and $B(\alpha)$
followed by a reduction by $F(\alpha)$; that is, $C(\alpha) = A(\alpha)B(\alpha) \bmod F(\alpha)$. The reduction can be
interleaved with the polynomial multiplication or performed separately after the polynomial multiplication is finished.
In general, interleaved reduction is preferred because non-interleaved methods require a lot of hardware resources. A
basic method to implement a multiplication for GF(p^m) is the shift-and-add method. Multipliers can be classified into
two types, depending on the order in which the coefficients of the multiplier operand are processed: most significant
element (MSE)-first multipliers and least significant element (LSE)-first multipliers [1,6,10,12,13]. In this paper, we
focus on MSE-first multipliers that use a trinomial as the irreducible polynomial. Trinomials are widely used in public
key cryptosystems such as ECCs (Elliptic Curve Cryptosystems) and PBCs (Pairing-Based Cryptosystems) for efficiency.
2.1. MSE-first serial multiplier in GF(p^m)
MSE-type multipliers are based on the following equation:
$$\begin{aligned}
C(\alpha) &= \sum_{k=0}^{m-1} c_k\alpha^k = A(\alpha)B(\alpha) \bmod F(\alpha) \\
&= \bigl(\cdots\bigl(a_{m-1}B(\alpha)\alpha + a_{m-2}B(\alpha) \bmod F(\alpha)\bigr)\alpha + \cdots\bigr)\alpha + a_0B(\alpha) \bmod F(\alpha).
\end{aligned}$$
This equation can be evaluated recursively by
$$C(\alpha) \leftarrow C(\alpha)\alpha + a_iB(\alpha) \bmod F(\alpha). \tag{2}$$
Therefore, one register $C(\alpha)$ is needed to store intermediate values. Since the degrees of $a_iB(\alpha)$ and
$C(\alpha)\alpha$ are at most $m-1$ and $m$, respectively, the modular reduction can be computed by only one subtraction.
Algorithm 1 shows an MSE-first serial multiplication [12].
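
Algorithm 1 MSE-first serial multiplication
Input: $A(\alpha) = \sum_{i=0}^{m-1} a_i\alpha^i$, $B(\alpha) = \sum_{i=0}^{m-1} b_i\alpha^i$, where $a_i, b_i \in \mathrm{GF}(p)$
Output: $C(\alpha) = \sum_{i=0}^{m-1} c_i\alpha^i = A(\alpha)B(\alpha) \bmod F(\alpha)$
1: $C(\alpha) \leftarrow 0$
2: for $i = m-1$ to $0$ do
3:     $C(\alpha) \leftarrow C(\alpha)\alpha + a_iB(\alpha) \bmod F(\alpha)$
4: end for
5: return $C(\alpha)$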
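
The loop of Algorithm 1 can be sketched in software as follows. This is an illustrative model rather than the hardware design: elements are coefficient lists with index i holding the $\alpha^i$ coefficient, the function name `mse_first_mul` is hypothetical, and the example field GF(3^5) with $F(x) = x^5 + 2x + 1$ is an assumed test case, not one taken from the paper.

```python
def mse_first_mul(a, b, p, m, t, ft, f0):
    """MSE-first product of a and b modulo F(x) = x^m + ft*x^t + f0 over GF(p)."""
    c = [0] * m
    for i in range(m - 1, -1, -1):          # most significant coefficient of A first
        top = c[m - 1]                      # coefficient that would move to alpha^m
        c = [0] + c[:m - 1]                 # C(alpha) * alpha is a one-position shift
        # Substitute alpha^m = -ft*alpha^t - f0 (the single subtraction of top * F)
        c[t] = (c[t] - ft * top) % p
        c[0] = (c[0] - f0 * top) % p
        # Accumulate a_i * B(alpha); its degree is at most m - 1
        c = [(cj + a[i] * bj) % p for cj, bj in zip(c, b)]
    return c


# alpha * alpha^4 = alpha^5 = -2*alpha - 1 = alpha + 2 in GF(3^5) with F = x^5 + 2x + 1
print(mse_first_mul([0, 1, 0, 0, 0], [0, 0, 0, 0, 1], 3, 5, 1, 2, 1))  # [2, 1, 0, 0, 0]
```

Only one register-sized list `c` is carried between iterations, which mirrors the single register $C(\alpha)$ mentioned in the text.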
A structure of the MSE-first serial multiplier is shown in Fig. 1(a). The MOD unit in Fig. 1(a) computes
$C(\alpha)\alpha \bmod F(\alpha)$ and can be implemented simply, as in Fig. 1(b), when an irreducible trinomial is used.

Fig. 1. MSE-first serial multiplier for GF(p^m) defined by trinomials: (a) MSE-first serial multiplier; (b) MOD unit.

Algorithm 2 LSE-first serial multiplication
Input: $A(\alpha) = \sum_{i=0}^{m-1} a_i\alpha^i$, $B(\alpha) = \sum_{i=0}^{m-1} b_i\alpha^i$, where $a_i, b_i \in \mathrm{GF}(p)$
Output: $C(\alpha) = \sum_{i=0}^{m-1} c_i\alpha^i = A(\alpha)B(\alpha) \bmod F(\alpha)$
1: $C(\alpha) \leftarrow 0$, $R(\alpha) \leftarrow B(\alpha)$
2: for $i = 0$ to $m-1$ do
3:     $C(\alpha) \leftarrow C(\alpha) + a_iR(\alpha)$
4:     $R(\alpha) \leftarrow R(\alpha)\alpha \bmod F(\alpha)$
5: end for
6: return $C(\alpha)$

Multiplying by $\alpha$ without a modular reduction is equivalent to a shift operation. Let $V(\alpha) = C(\alpha)\alpha$,
which is the input of the MOD unit in each loop iteration.
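$$V(\alpha) = C(\alpha)\alpha = c_{m-1}\alpha^m + \cdots + c_0\alpha. \tag{3}$$
Substituting $\alpha^m = -f_t\alpha^t - f_0$ into (3) for degree reduction, we have
$$U(\alpha) = V(\alpha) \bmod F(\alpha) = \sum_{i=1}^{m-1} c_{i-1}\alpha^i + (-f_tc_{m-1})\alpha^t + (-f_0c_{m-1}).$$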
Therefore, two GF(p) multiplications, $-f_tc_{m-1}$ and $-f_0c_{m-1}$, and one GF(p) addition, $c_{t-1} + (-f_tc_{m-1})$,
are needed. Note that since $f_t$ and $f_0$ are fixed values, their negations do not affect the computation time.
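A small software model of the MOD unit makes the operation count visible. It is only a sketch: the name `mod_unit` and the list layout (index i holds the $\alpha^i$ coefficient) are assumptions, and the constants $-f_t$ and $-f_0$ are formed once, outside the data path, as the text suggests.

```python
def mod_unit(c, p, t, ft, f0):
    """Return C(alpha)*alpha mod F(alpha) for F(x) = x^m + ft*x^t + f0, with m = len(c)."""
    m = len(c)
    neg_ft = -ft % p                 # fixed constants: negation costs nothing at run time
    neg_f0 = -f0 % p
    top = c[m - 1]                   # c_{m-1}, the coefficient shifted out to alpha^m
    u = [0] + c[:m - 1]              # shift: u_i = c_{i-1} for i = 1..m-1
    u[t] = (u[t] + neg_ft * top) % p  # one multiplication and the single addition
    u[0] = (neg_f0 * top) % p         # the second multiplication
    return u


# (1 + 2a + a^4) * a = a + 2a^2 + a^5 = 2 + 2a + 2a^2 in GF(3^5) with F = x^5 + 2x + 1
print(mod_unit([1, 2, 0, 0, 1], 3, 1, 2, 1))  # [2, 2, 2, 0, 0]
```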
Since the critical path of the MSE-first serial multiplier depends on the ADD unit and the MOD unit, the time delay is
m MUL + 2m ADD, where MUL and ADD denote a GF(p) multiplier and a GF(p) adder, respectively. In terms of area, m MUL are
required in the MUL unit, m ADD in the ADD unit, and 2 MUL + 1 ADD in the MOD unit. Therefore the total area
complexity is (m + 2) MUL + (m + 1) ADD.
2.2. LSE-first serial multiplier in GF(p^m)
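LSE-type multipliers are based on the following equation:
$$\begin{aligned}
C(\alpha) &= \sum_{k=0}^{m-1} c_k\alpha^k = A(\alpha)B(\alpha) \bmod F(\alpha) \\
&= a_0B(\alpha) + a_1\bigl(B(\alpha)\alpha \bmod F(\alpha)\bigr) + \cdots + a_{m-1}\bigl(B(\alpha)\alpha^{m-1} \bmod F(\alpha)\bigr).
\end{aligned}$$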
An LSE-type multiplier based on the above equation performs two steps per iteration. One computes
$B(\alpha)\alpha^i \bmod F(\alpha)$, and the other multiplies it by $a_i$ and adds the product to the accumulator
$C(\alpha)$. Compared with the MSE multiplier, these steps can be computed in parallel, which reduces the time
complexity, but the area complexity increases because of an additional register, extra control signals, multiplexers,
and so on. Algorithm 2 shows an LSE-first serial multiplication [12].
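For comparison with the MSE-first loop, Algorithm 2 can be sketched as below. The function name `lse_first_mul`, the list layout (index i holds the $\alpha^i$ coefficient), and the example field are assumptions for illustration. The second list `r` plays the role of the additional register $R(\alpha)$.

```python
def lse_first_mul(a, b, p, m, t, ft, f0):
    """LSE-first product of a and b modulo F(x) = x^m + ft*x^t + f0 over GF(p)."""
    c = [0] * m
    r = list(b)                              # R(alpha) starts as B(alpha)
    for i in range(m):                       # least significant coefficient of A first
        # Accumulate a_i * R(alpha) into C(alpha)
        c = [(cj + a[i] * rj) % p for cj, rj in zip(c, r)]
        # Independently update R(alpha) <- R(alpha) * alpha mod F(alpha)
        top = r[m - 1]
        r = [0] + r[:m - 1]
        r[t] = (r[t] - ft * top) % p
        r[0] = (r[0] - f0 * top) % p
    return c


# Same example as for the MSE-first loop: alpha * alpha^4 = alpha + 2 in GF(3^5)
print(lse_first_mul([0, 1, 0, 0, 0], [0, 0, 0, 0, 1], 3, 5, 1, 2, 1))  # [2, 1, 0, 0, 0]
```

The two statements in the loop body do not depend on each other's results within one iteration, which is why the hardware can evaluate them in parallel.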
Fig. 2. LSE-first serial multiplier for GF(p^m).
A structure of the LSE-first serial multiplier is shown in Fig. 2. The MOD unit used in the LSE multiplier is the same
as that used in the MSE multiplier in Fig. 1(b). The critical path depends only on the MUL and ADD units, i.e.,
m MUL + m ADD, since the MOD unit operates independently. In terms of area, m MUL are required in the MUL unit, m ADD
in the ADD unit, and 2 MUL + 1 ADD in the MOD unit. Therefore the total area complexity is (m + 2) MUL + (m + 1) ADD,
and an additional register is required to store $R(\alpha)$.
3. Proposed multiplication over GF(p^m)
In this section we propose a new serial MSE-first multiplier for GF(p^m) that uses a trinomial as the irreducible
polynomial. The basic idea is to parallelize the MOD unit of the MSE multiplier efficiently.
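If an irreducible trinomial is used in Step 3 of Algorithm 1, we have
$$\begin{aligned}
C(\alpha)\alpha \bmod F(\alpha) &= \sum_{i=0}^{m-1} c_i\alpha^{i+1} \bmod F(\alpha) \\
&= c_{m-1}\alpha^m + \sum_{i=0}^{m-2} c_i\alpha^{i+1} \bmod F(\alpha) \\
&= (c_{t-1} - c_{m-1}f_t)\alpha^t - (c_{m-1}f_0) + \sum_{i=0,\, i\neq t-1}^{m-2} c_i\alpha^{i+1}.
\end{aligned} \tag{4}$$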
Based on this equation, the ith step of Algorithm 1 can be expressed as follows:
$$C(\alpha)^{(i)} = C(\alpha)^{(i+1)}\alpha + a_iB(\alpha) = \sum_{j=0}^{m-1} c_j^{(i+1)}\alpha^{j+1} + a_iB(\alpha) \bmod F(\alpha), \tag{5}$$
where $C(\alpha)^{(m)} = 0$ and $C(\alpha)^{(0)}$ is the output. Let $c_j^{(i)}$ denote the jth coefficient of
$C(\alpha)^{(i)}$. By the modular reduction, we have
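$$\begin{aligned}
C(\alpha)^{(i)} &= \sum_{j=0}^{m-1} c_j^{(i+1)}\alpha^{j+1} + \sum_{j=0}^{m-1} a_ib_j\alpha^j \bmod F(\alpha) \\
&= \bigl(c_{t-1}^{(i+1)} - c_{m-1}^{(i+1)}f_t + a_ib_t\bigr)\alpha^t + \bigl(a_ib_0 - c_{m-1}^{(i+1)}f_0\bigr) + \sum_{j=0,\, j\neq t-1}^{m-2} \bigl(c_j^{(i+1)} + a_ib_{j+1}\bigr)\alpha^{j+1} \\
&= \sum_{j=0}^{m-1} c_j^{(i)}\alpha^j,
\end{aligned} \tag{6}$$
where $c_{m-1}^{(i+1)}\alpha^m = -c_{m-1}^{(i+1)}f_t\alpha^t - c_{m-1}^{(i+1)}f_0$.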
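For $A(\alpha) \in \mathrm{GF}(p^m)$, we denote
$$\bar{A}(\alpha) = A(\alpha) - a_{t+1}f_t^{-1}F(\alpha)\alpha, \qquad \tilde{A}(\alpha) = A(\alpha)\alpha,$$
where $A(\alpha) = \bar{A}(\alpha) \bmod F(\alpha)$. In the same way, we write $\tilde{F}(\alpha) = F(\alpha)\alpha$, with
coefficients $\tilde{f}_{t+1} = f_t$ and $\tilde{f}_1 = f_0$.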
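Theorem 1. For $A(\alpha), B(\alpha) \in \mathrm{GF}(p^m)$, the multiplication is computed by
$$\begin{aligned}
A(\alpha)B(\alpha) &= \bigl(A(\alpha)\alpha B(\alpha) \bmod F(\alpha)\alpha\bigr)/\alpha \\
&= \bigl(A(\alpha)\alpha \bar{B}(\alpha) \bmod F(\alpha)\alpha\bigr)/\alpha,
\end{aligned}$$
where $B(\alpha) = \bar{B}(\alpha) \bmod F(\alpha)\alpha$. The $(t+1)$th coefficient of $\bar{B}(\alpha)$ is 0.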
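Proof. If $A(\alpha)B(\alpha) = Q(\alpha)F(\alpha) + R(\alpha)$ with $\deg R < m$, then
$A(\alpha)\alpha B(\alpha) = Q(\alpha)F(\alpha)\alpha + R(\alpha)\alpha$ with $\deg R\alpha < m+1$, which gives the first
equality. By the definition, $\bar{B}(\alpha) = B(\alpha) - b_{t+1}f_t^{-1}\tilde{F}(\alpha)$. Therefore,
$B(\alpha) \equiv \bar{B}(\alpha) \bmod F(\alpha)\alpha$. The term of degree $t+1$ of $\bar{B}(\alpha)$ is
$$b_{t+1}\alpha^{t+1} - b_{t+1}f_tf_t^{-1}\alpha^{t+1} = (b_{t+1} - b_{t+1})\alpha^{t+1} = 0. \qquad \square$$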
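Theorem 1 can be checked numerically with plain polynomial arithmetic. The sketch below is illustrative only: the helper names (`poly_mul`, `poly_mod`, `check_theorem1`) are hypothetical, polynomials are coefficient lists with the lowest degree first, $t \le m-2$ is assumed so that $b_{t+1}$ exists, and the inverse $f_t^{-1}$ is obtained by Fermat's little theorem, which assumes p is prime.

```python
def poly_mul(a, b, p):
    """Schoolbook product of two coefficient lists over GF(p)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def poly_mod(a, f, p):
    """Remainder of a modulo a monic polynomial f, padded to deg(f) coefficients."""
    a, d = list(a), len(f) - 1
    for k in range(len(a) - 1, d - 1, -1):
        q = a[k]
        for j in range(d + 1):               # subtract q * f * x^(k-d); clears a[k]
            a[k - d + j] = (a[k - d + j] - q * f[j]) % p
    a = a[:d]
    return a + [0] * (d - len(a))


def check_theorem1(a, b, p, m, t, ft, f0):
    f = [0] * (m + 1)
    f[0], f[t], f[m] = f0, ft, 1             # F(x) = x^m + ft*x^t + f0
    f_tilde = [0] + f                        # F(alpha) * alpha
    scale = b[t + 1] * pow(ft, p - 2, p) % p  # b_{t+1} * ft^{-1}
    b_bar = [(x - scale * y) % p for x, y in zip(list(b) + [0, 0], f_tilde)]
    direct = poly_mod(poly_mul(a, b, p), f, p)
    lifted = poly_mod(poly_mul([0] + list(a), b_bar, p), f_tilde, p)
    return direct, lifted, b_bar


direct, lifted, b_bar = check_theorem1([1, 2, 0, 1, 2], [2, 0, 1, 1, 1], 3, 5, 1, 2, 1)
# Dividing by alpha drops the constant coefficient, which must be zero.
print(lifted[0] == 0 and lifted[1:] == direct, b_bar[2] == 0)
```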
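By Theorem 1, Step 3 of Algorithm 1 can be changed into
$C(\alpha) \equiv C(\alpha)\alpha + \tilde{a}_i\bar{B}(\alpha) \bmod \tilde{F}(\alpha)$. Since the degree of
$\tilde{F}(\alpha)$ is $m+1$, the degree of the intermediate result is also $m+1$. Eq. (6) can be changed as follows:
$$\begin{aligned}
C(\alpha)^{(i)} &= C(\alpha)^{(i+1)}\alpha + \tilde{a}_i\bar{B}(\alpha) \bmod F(\alpha)\alpha \\
&= \bigl(\tilde{c}_{m+1}^{(i+1)} + \tilde{a}_i\bar{b}_{m+1}\bigr)\alpha^{m+1} + \sum_{j=0}^{m} \bigl(\tilde{c}_j^{(i+1)} + \tilde{a}_i\bar{b}_j\bigr)\alpha^j \bmod \tilde{F}(\alpha),
\end{aligned} \tag{7}$$
where $\tilde{c}_j^{(i+1)}$ denotes the jth coefficient of $C(\alpha)^{(i+1)}\alpha$, i.e., $\tilde{c}_j^{(i+1)} = c_{j-1}^{(i+1)}$.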
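Let $\delta^{(i)} = \tilde{c}_{m+1}^{(i+1)} + \tilde{a}_i\bar{b}_{m+1}$. We then have
$$\delta^{(i)}\alpha^{m+1} + \sum_{j=0}^{m} \bigl(\tilde{c}_j^{(i+1)} + \tilde{a}_i\bar{b}_j\bigr)\alpha^j \bmod \tilde{F}(\alpha) \equiv -\delta^{(i)}\tilde{f}_{t+1}\alpha^{t+1} - \delta^{(i)}\tilde{f}_1\alpha + \sum_{j=0}^{m} \bigl(\tilde{c}_j^{(i+1)} + \tilde{a}_i\bar{b}_j\bigr)\alpha^j.$$
Therefore, the coefficients of $C(\alpha)^{(i)}$ can be computed as follows:
$$c_j^{(i)} \equiv \begin{cases}
\tilde{c}_j^{(i+1)} + \tilde{a}_i\bar{b}_j & \text{if } 0 \le j \le m,\ j \neq t+1, 1, \\
\tilde{c}_{t+1}^{(i+1)} + \tilde{a}_i\bar{b}_{t+1} - \delta^{(i)}\tilde{f}_{t+1} & \text{if } j = t+1, \\
\tilde{c}_1^{(i+1)} + \tilde{a}_i\bar{b}_1 - \delta^{(i)}\tilde{f}_1 & \text{if } j = 1.
\end{cases} \tag{8}$$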
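Theorem 2. Let $\lambda^{(i)} = \tilde{a}_{i+1}\bar{b}_0 + \tilde{a}_i\bar{b}_1$. Then
$$c_j^{(i)} \equiv \begin{cases}
\tilde{c}_j^{(i+1)} + \tilde{a}_i\bar{b}_j & \text{if } 0 \le j \le m,\ j \neq t+1, 1, \\
\tilde{c}_{t+1}^{(i+1)} - \delta^{(i)}\tilde{f}_{t+1} & \text{if } j = t+1, \\
\lambda^{(i)} - \delta^{(i)}\tilde{f}_1 & \text{if } j = 1.
\end{cases}$$
Each coefficient of $C(\alpha)^{(i)}$ is computed by one MUL and one ADD if $\delta^{(i)}$ and $\lambda^{(i)}$ are
precomputed in the previous step.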
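Proof. First, we show that $\lambda^{(i)}$ and $\delta^{(i)}$ can be precomputed in the previous loop iteration. By its
definition, $\lambda^{(i)}$ can be precomputed because $\tilde{a}_{i+1}$, $\bar{b}_0$, $\tilde{a}_i$, and $\bar{b}_1$ are
input values. Moreover, $\tilde{c}_{m+1}^{(i+1)} = c_m^{(i+1)} = \tilde{c}_m^{(i+2)} + \tilde{a}_{i+1}\bar{b}_m$, and
$\bar{b}_m = 0$ if an irreducible trinomial is used. Therefore,
$\delta^{(i)} = \tilde{c}_{m+1}^{(i+1)} + \tilde{a}_i\bar{b}_{m+1} = \tilde{c}_m^{(i+2)} + \tilde{a}_i\bar{b}_{m+1}$, and it
can be computed in the previous loop iteration.
Now we derive the theorem from Eq. (8). For $0 \le j \le m$, $j \neq t+1, 1$, the claim is clear. For $j = t+1$, since
the $(t+1)$th coefficient of $\bar{B}(\alpha)$ is 0, $c_{t+1}^{(i)} = \tilde{c}_{t+1}^{(i+1)} - \delta^{(i)}\tilde{f}_{t+1}$.
Finally, for $j = 1$, $c_1^{(i)} = \tilde{c}_1^{(i+1)} + \tilde{a}_i\bar{b}_1 - \delta^{(i)}\tilde{f}_1$ and
$\tilde{c}_1^{(i+1)} = c_0^{(i+1)} = \tilde{a}_{i+1}\bar{b}_0$. Therefore, $c_1^{(i)} = \lambda^{(i)} - \delta^{(i)}\tilde{f}_1$. $\square$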
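The recurrence of Eq. (8) and Theorem 2 can be exercised in software as follows. This is a sketch of the arithmetic only, not a cycle-accurate model of the hardware schedule: the function names are hypothetical, $1 \le t \le m-2$ and p prime are assumed, and `reference_mul` is an ordinary multiply-then-reduce routine added purely for cross-checking.

```python
def new_mse_mul(a, b, p, m, t, ft, f0):
    """A*B mod F via C <- C*alpha + a~_i * B-bar mod F*alpha, then division by alpha."""
    s = b[t + 1] * pow(ft, p - 2, p) % p     # b_{t+1} * ft^{-1} (Fermat inverse)
    b_bar = list(b) + [0, 0]                 # coefficients 0..m+1 of B-bar
    b_bar[m + 1] = -s % p
    b_bar[t + 1] = 0
    b_bar[1] = (b[1] - s * f0) % p
    a_t = [0] + list(a) + [0]                # A~ = A*alpha, plus a zero guard at index m+1
    c = [0] * (m + 1)
    for i in range(m, -1, -1):
        c_sh = [0] + c                       # coefficients of C^(i+1) * alpha
        delta = (c_sh[m + 1] + a_t[i] * b_bar[m + 1]) % p
        lam = (a_t[i + 1] * b_bar[0] + a_t[i] * b_bar[1]) % p
        # Theorem 2: lambda equals the unreduced degree-1 coefficient
        assert lam == (c_sh[1] + a_t[i] * b_bar[1]) % p
        new = [(c_sh[j] + a_t[i] * b_bar[j]) % p for j in range(m + 1)]
        new[t + 1] = (c_sh[t + 1] - delta * ft) % p   # b-bar_{t+1} = 0, f~_{t+1} = ft
        new[1] = (lam - delta * f0) % p               # f~_1 = f0
        c = new
    assert c[0] == 0                         # the result is a multiple of alpha
    return c[1:]


def reference_mul(a, b, p, m, t, ft, f0):
    """Multiply, then reduce with alpha^k = alpha^(k-m) * (-ft*alpha^t - f0)."""
    prod = [0] * (2 * m - 1)
    for i in range(m):
        for j in range(m):
            prod[i + j] = (prod[i + j] + a[i] * b[j]) % p
    for k in range(2 * m - 2, m - 1, -1):
        q, prod[k] = prod[k], 0
        prod[k - m + t] = (prod[k - m + t] - q * ft) % p
        prod[k - m] = (prod[k - m] - q * f0) % p
    return prod[:m]


args = (3, 5, 1, 2, 1)                       # assumed example: GF(3^5), F = x^5 + 2x + 1
print(new_mse_mul([0, 1, 0, 0, 0], [0, 0, 0, 0, 1], *args))  # [2, 1, 0, 0, 0]
```

In this model `delta` and `lam` depend only on the previous state and on input coefficients, which is the property that lets the hardware precompute them one iteration ahead.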
Algorithm 3 expresses the multiplication method for a new MSE-first bit-serial multiplier.
The proposed algorithm performs m + 3 rounds of additions and multiplications for GF(p^m). In both the multiplication
and the addition phase, steps (a), (b), and (c) work in parallel. Step (a) initializes $\bar{B}(\alpha)$, i.e., computes
$\bar{B}(\alpha) = B(\alpha) - b_{t+1}f_t^{-1}F(\alpha)\alpha$. Here $\bar{b}_{m+1}$ and $\bar{b}_1$ are updated by
$F(\alpha)\alpha$, $\bar{b}_m$ and $\bar{b}_{t+1}$ are zero, and every remaining $\bar{b}_i$ is equal to $b_i$. Therefore,
only $\bar{b}_{m+1}$ and $\bar{b}_1$ have to be computed. Step (b) computes $c_j^{(i)}$ of Theorem 2. Step (c) computes
$\delta^{(i)}$ and $\lambda^{(i)}$, which are used in the next loop iteration. Accordingly, the algorithm initializes
$\bar{B}(\alpha)$ when $i = m+2$ and prepares (a) and (c) when $i = m+1$, with $\tilde{a}_{m+2}$ and $\tilde{a}_{m+1}$ set
to 0. The hardware architecture of the proposed MSE-first serial multiplier is shown in Fig. 3.
The register C stores an intermediate value of µm bits, where µ is the bit size of p, and the registers δ, λ,
$\bar{b}_{m+1}$, and $\bar{b}_1$ are each µ bits wide. Steps (a) and (c) need 4 multipliers and 3 adders for GF(p) and
run in parallel with step (b).
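
Algorithm 3 New MSE-first serial multiplication
Input: $\tilde{A}(\alpha) = A(\alpha)\alpha = \sum_{i=0}^{m+2} \tilde{a}_i\alpha^i = (0, 0, a_{m-1}, \ldots, a_0, 0)$, $B(\alpha) = \sum_{i=0}^{m-1} b_i\alpha^i = (b_{m-1}, \ldots, b_0)$, where $a_i, b_i \in \mathrm{GF}(p)$, and $-f_t, -f_t^{-1}, -f_0, -f_t^{-1}f_0 \in \mathrm{GF}(p)$
Output: $C(\alpha) = \sum_{i=0}^{m-1} c_i\alpha^i = A(\alpha)B(\alpha) \bmod F(\alpha)$
1: $\delta \leftarrow 0$, $\lambda \leftarrow 0$, $\bar{b}_{m+1} \leftarrow 0$, $\bar{b}_0 \leftarrow 0$
2: for $i = m+2$ to $0$ do
3:     1. Multiplication
4:         (a) (Initialize $\bar{B}(\alpha)$) $\bar{b}_{m+1} \leftarrow b_{t+1}(-f_t^{-1})$, $temp_1 \leftarrow b_{t+1}(-f_t^{-1}f_0)$
5:         (b) $V(\alpha) \leftarrow \sum_{j=0,\, j\neq t+1, 1}^{m-1} \tilde{a}_ib_j\alpha^j$, $v_{t+1} \leftarrow \delta(-f_t)$, $v_1 \leftarrow \delta(-f_0)$
6:         (c) (Precomputation) $temp_2 \leftarrow \bar{b}_{m+1}\tilde{a}_{i-1}$, $temp_3 \leftarrow \bar{b}_1\tilde{a}_{i-1}$
7:     2. Addition
8:         (a) (Initialize $\bar{B}(\alpha)$) $\bar{b}_1 \leftarrow temp_1 + b_1$
9:         (b) $C(\alpha) \leftarrow \sum_{j=2}^{m-1} (v_j + c_{j-1})\alpha^j$, $c_m \leftarrow c_{m-1}$, $c_1 \leftarrow v_1 + \lambda$, $c_0 \leftarrow v_0$
10:        (c) (Precomputation) $\delta \leftarrow temp_2 + c_{m-1}$, $\lambda \leftarrow temp_3 + c_0$
11: end for
12: return $C(\alpha)/\alpha$
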
Fig. 3. New MSE-first serial multiplier for GF(p^m).

Table 1
A complexity comparison of bit-serial multipliers for GF(p^m).

| Multiplier | Area: ADD | Area: MUL | Area: Register (bits) | Critical path delay | Latency (# clocks) |
|---|---|---|---|---|---|
| MSE-first multiplier [12] | m + 1 | m + 2 | µm | 1 MUL + 2 ADD | m |
| LSE-first multiplier [12] | m + 1 | m + 2 | 2µm | 1 MUL + 1 ADD | m |
| New MSE-first multiplier | m + 2 | m + 4 | µ(m + 4) | 1 MUL + 1 ADD | m + 3 |

ADD: adder for GF(p); MUL: multiplier for GF(p).

4. Comparison
In this section, we compare the proposed MSE-first multiplier with existing bit-serial multipliers. As seen from
Table 1, the area complexity of the proposed multiplier increases by 1 ADD, 2 MUL, and a 4µ-bit register relative to
the existing MSE-first multiplier. However, if the proposed multiplier is implemented in finite fields of small
characteristic such as GF(3^m), GF(5^m), and GF(7^m), which are widely used in pairing-based cryptosystems, then the
increment of 1 ADD, 2 MUL, and a 4µ-bit register does not significantly affect the area complexity. For example, when
m ≥ 60, the increment of the area complexity is at most approximately 2.5%.

Table 2
A complexity comparison of bit-serial multipliers for GF(3^m).

| Multiplier | Area: ADD | Area: MUL | Area: Register (bits) | Critical path delay | Latency (# clocks) |
|---|---|---|---|---|---|
| MSE-first multiplier [14,6,15,2] | (4m + 4) OR + (3m + 3) XOR | (2m + 4) OR + (4m + 8) AND | 2m | 1 AND + 3 OR + 4 XOR | m |
| LSE-first multiplier [1,2] | (4m + 4) OR + (3m + 3) XOR | (2m + 4) OR + (4m + 8) AND | 4m | 1 AND + 2 OR + 2 XOR | m |
| New MSE-first multiplier | (4m + 8) OR + (3m + 6) XOR | (2m + 8) OR + (4m + 16) AND | 2m + 8 | 1 AND + 2 OR + 2 XOR | m + 3 |

The time delay of the proposed multiplier is reduced from m MUL + 2m ADD to (m + 3) MUL + (m + 3) ADD. For example,
if the time delays of ADD and MUL are equal, then the total time delay is reduced by approximately 30% when m ≥ 60.
The proposed multiplier also has the same critical path delay as the previous LSE multiplier [12], while it uses
roughly half of the registers used in the previous LSE multiplier [12].
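The percentages quoted for m ≥ 60 can be reproduced with a rough unit-cost model. The sketch below assumes, as the text does for the delay, that one ADD and one MUL cost the same, and it additionally ignores register area, which is a simplification of ours. The function name `relative_changes` is hypothetical.

```python
def relative_changes(m):
    """Return (delay reduction, ADD+MUL area increase) of the new multiplier vs. MSE-first."""
    old_delay = m + 2 * m              # m MUL + 2m ADD
    new_delay = (m + 3) + (m + 3)      # (m+3) MUL + (m+3) ADD
    old_area = (m + 2) + (m + 1)       # (m+2) MUL + (m+1) ADD
    new_area = (m + 4) + (m + 2)       # (m+4) MUL + (m+2) ADD
    return (old_delay - new_delay) / old_delay, (new_area - old_area) / old_area


delay_gain, area_cost = relative_changes(60)
print(f"delay reduced by {delay_gain:.1%}, area increased by {area_cost:.1%}")
```

Under these assumptions the delay reduction is (m − 6)/(3m), which grows towards one third as m increases, while the area overhead 3/(2m + 3) shrinks.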
Most pairing-based cryptosystems are implemented using arithmetic operations over GF(3^m), so we focus on this case.
For the comparison, we refer to the results in [3,4]. As explained in [3,4], for p = 3, ADD and MUL require
3 XOR + 4 OR and 4 AND + 2 OR, respectively. As seen in Table 2, the proposed bit-serial multiplier is comparable in
area to existing bit-serial multipliers based on irreducible trinomials, and its critical path delay is shorter than
that of existing MSE multipliers [14,6,15,2]. Note that each basic GF(3) arithmetic operation on most Altera or Xilinx
FPGAs requires two 4-input Look-Up Tables (LUTs). Hence, in an FPGA implementation, the critical path of the proposed
architecture is two LUTs, whereas the critical path of the existing MSE-first multiplier is three LUTs. Moreover, the
new multiplier uses about half of the registers of the existing LSE multipliers in [1,2]. Therefore, the proposed
multiplier is very efficient in terms of area–time complexity when trinomials are used for constructing finite fields.
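The gate counts in Table 2 follow from Table 1 by substituting the per-unit costs quoted above for p = 3 and µ = 2 bits per coefficient. A small sketch of that substitution, with a hypothetical helper `table2_row` and m = 60 chosen only as an example:

```python
ADD_GATES = {"XOR": 3, "OR": 4}    # one GF(3) adder, as quoted from [3,4]
MUL_GATES = {"AND": 4, "OR": 2}    # one GF(3) multiplier, as quoted from [3,4]
MU = 2                             # bits used to store one GF(3) element


def table2_row(n_add, n_mul, reg_elements):
    """Expand Table 1 unit counts into gate counts and register bits for GF(3^m)."""
    add = {gate: n_add * k for gate, k in ADD_GATES.items()}
    mul = {gate: n_mul * k for gate, k in MUL_GATES.items()}
    return add, mul, MU * reg_elements


m = 60
print(table2_row(m + 1, m + 2, m))        # existing MSE-first multiplier
print(table2_row(m + 2, m + 4, m + 4))    # new MSE-first multiplier
```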
5. Conclusion
In this paper, we have considered an MSE-first serial multiplier for GF(p^m) and optimized it for the case in which
the field is defined by an irreducible trinomial. In public key cryptosystems such as elliptic curve cryptosystems
(ECCs) and pairing-based cryptosystems (PBCs), trinomials or pentanomials are used for efficiency. Therefore, systems
that use trinomials can be implemented efficiently, in terms of area–time complexity, with the proposed MSE-first
multiplier.
References
[1] G. Bertoni, J. Guajardo, S. Kumar, G. Orlando, C. Paar, T. Wollinger, Efficient GF(p^m) arithmetic architectures for cryptographic applications, in: CT-RSA 2003, in: LNCS, vol. 2612, Springer-Verlag, 2003, pp. 158–175.
[2] T. Kerins, E.M. Popovici, W.P. Marnane, Algorithms and architectures for use in FPGA implementations of identity based encryption schemes, in: FPL 2004, in: LNCS, vol. 3203, Springer-Verlag, 2004, pp. 74–83.
[3] D. Page, N. Smart, Hardware implementation of finite fields of characteristic three, in: CHES 2002, in: LNCS, vol. 2523, Springer-Verlag, 2003, pp. 529–539.
[4] R. Granger, D. Page, M. Stam, Hardware and software normal basis arithmetic for pairing based cryptography in characteristic three, IEEE Transactions on Computers 54 (7) (2005) 852–860.
[5] G. Bertoni, L. Breveglieri, P. Fragneto, G. Pelosi, Parallel hardware architectures for the cryptographic Tate pairing, in: Proceedings of the Third International Conference on Information Technology: New Generations, ITNG'06, 2006, pp. 186–191.
[6] J. Beuchat, M. Shirase, T. Takagi, E. Okamoto, An algorithm for the η_T pairing calculation in characteristic three and its hardware implementation, in: 18th IEEE International Symposium on Computer Arithmetic, ARITH-18, 2007, pp. 97–104.
[7] P. Grabher, D. Page, Hardware acceleration of the Tate pairing in characteristic three, in: CHES 2005, in: LNCS, vol. 3659, Springer-Verlag, 2005, pp. 398–411.
[8] T. Kerins, W. Marnane, E. Popovici, P.S.L.M. Barreto, Efficient hardware for the Tate pairing calculation in characteristic three, in: CHES 2005, in: LNCS, vol. 3659, Springer-Verlag, 2005, pp. 398–411.
[9] R. Ronan, C. Ó hÉigeartaigh, C. Murphy, M. Scott, T. Kerins, W. Marnane, An embedded processor for a pairing-based cryptosystem, in: Proceedings of the Third International Conference on Information Technology: New Generations, ITNG'06, 2006, pp. 192–197.
[10] L. Song, K. Parhi, Low energy digit-serial/parallel finite field multipliers, Journal of VLSI Signal Processing 19 (2) (1998) 149–166.
[11] G. Bertoni, J. Guajardo, G. Orlando, Systolic and scalable architectures for digit-serial multiplication in fields GF(p^m), in: INDOCRYPT 2003, in: LNCS, vol. 2904, Springer-Verlag, 2003, pp. 349–362.
[12] J. Beuchat, T. Miyoshi, Y. Oyama, E. Okamoto, Multiplication over F_{p^m} on FPGA: A survey, in: ARC-2007, in: LNCS, vol. 4419, Springer-Verlag, 2007, pp. 214–225.
[13] J. Guajardo, T. Güneysu, S. Kumar, C. Paar, J. Pelzl, Efficient hardware implementation of finite fields with applications to cryptography, Acta Applicandae Mathematicae 93 (1–3) (2006) 75–118.
[14] J. Beuchat, N. Brisebarre, M. Shirase, T. Takagi, E. Okamoto, A coprocessor for the final exponentiation of the η_T pairing in characteristic three, in: WAIFI 2007, in: LNCS, vol. 4547, Springer-Verlag, 2007, pp. 25–39.
[15] T. Kerins, W.P. Marnane, E.M. Popovici, Hardware accelerators for pairing based cryptosystems, IEE Proceedings on Information Security 152 (1) (2005) 47–56.
