This paper presents a graph-based approach to designing arithmetic circuits over Galois fields (GFs) using normal basis representations. The proposed method is based on a graph-based circuit description called Galois-field Arithmetic Circuit Graph (GF-ACG). First, we extend the GF-ACG representation to describe GFs defined by normal basis in addition to polynomial basis. We then apply the extended design method to Massey-Omura parallel multipliers, which are well known as typical multipliers based on normal basis. We present the formal description of the multipliers in a hierarchical manner and show that the verification time can be greatly reduced in comparison with that of conventional techniques. In addition, we design GF exponentiation circuits consisting of the Massey-Omura parallel multipliers and an inversion circuit over the composite field GF(((2^2)^2)^2) in order to demonstrate the advantages of normal-basis circuits over polynomial-basis ones.
Introduction
Applications of arithmetic operations over Galois fields (GFs) have been rapidly increasing owing to the high demand for reliable and secure communications and transactions using ECC (error correction code) and cryptographic operations [2]. These operations are often implemented in hardware in recent embedded devices, such as smart cards and cell phones, and the performance and dependability of the arithmetic circuits have a significant impact on the entire processor. Many hardware algorithms for GF arithmetic have been devised, and some of them can be implemented more efficiently in multiple-valued logic than in binary logic.
On the other hand, most such arithmetic circuits are designed at the logic level by researchers who have been specifically trained to understand GF arithmetic. Conventional Hardware Description Languages (HDLs) do not have high-level arithmetic data structures, arithmetic operations, or formulae over GFs. This sometimes requires us to describe the structural details of arithmetic circuits by hand at the lowest level of abstraction (i.e., AND-XOR expressions) in a flattened manner. In addition, functional verification using conventional logic simulation is quite time-consuming since these operations are usually performed with operands of more than 64 bits. Test pattern generation is also difficult since it varies with the irreducible polynomial even for the same operation (e.g., multiplication). In earlier related research, the formal verification of arithmetic circuits was primarily performed based on Decision Diagrams (DDs) and Binary Moment Diagrams (BMDs) [3]-[5]. However, these conventional approaches are basically limited not only to binary arithmetic over integers, but also to rather small circuits. Although Binary Decision Diagrams (BDDs) can also be applied to GF arithmetic, BDDs are known to be ineffective for XOR-based logic circuits. There is a decision diagram for Galois fields based on the decomposition of multiple-valued functions [6], but it is difficult to handle practical fields such as GF(2^16) and GF(2^32) with it and to apply it to formal verification. GF(2^m) arithmetic circuits were successfully verified in a few previous studies [7], [8]; however, the application of the verification method appears to be limited to the specific GF(2^m) circuits whose reference (i.e., equivalent) circuits can be prepared in advance.
Manuscript received November 29, 2013.
† The authors are with the Graduate School of Information Sciences, Tohoku University, Sendai-shi, 980-8579 Japan.
* A preliminary version of this paper appeared in the IEEE 43rd International Symposium on Multiple-Valued Logic (ISMVL 2013) [1].
a) E-mail: okamoto@aoki.ecei.tohoku.ac.jp
DOI: 10.1587/transinf.2013LOP0012
To address the above problems, a formal design and verification method for arithmetic circuits over GFs was proposed in [9] and [10]. The idea is to use a high-level mathematical graph associated with variables and arithmetic formulae over GFs, which is called the Galois-field Arithmetic Circuit Graph (GF-ACG). Using GF-ACGs, we can describe any GF arithmetic circuit in a hierarchical manner as a combination of arithmetic sub-circuits (graphs). Such a description is formally verified by checking, for every sub-circuit, whether its function is obtained from its internal structure. The equivalence checking can be performed by formula manipulations based on a polynomial reduction algorithm using Gröbner Basis [11], which makes it possible to verify practical arithmetic circuits in a short time. However, the previous works in [9] and [10] were limited to GF arithmetic represented by polynomial basis.
This paper presents an extension of GF-ACGs to designing arithmetic circuits over GFs represented by normal basis (NB). The space and time complexities of GF arithmetic operations heavily depend on how the field elements are represented. The NB representation is useful for designing GF arithmetic circuits such as inversion circuits and exponentiation circuits since the squaring operation in NB representation is performed only by wiring. In this paper, we first present the extension of GF-ACGs to design and verify circuits over GFs represented by NB in addition to polynomial basis (PB), and apply the extended GF-ACG to the formal description of Massey-Omura multipliers. The advantage of the proposed method is evaluated through the experimental verification of the designed multipliers. We also design a set of exponentiation circuits using the designed multipliers and a multiplicative inversion circuit over GF(((2^2)^2)^2) in order to evaluate the performance of NB-based circuits in comparison with that of PB-based ones. In addition, we further extend GF-ACG to composite fields based on NB and apply it to the formal design and verification of a multiplicative inversion circuit. Note that the preliminary version [1] addressed only prime and extension fields.
Galois-Field Arithmetic Circuit Graph
This section briefly describes the graph-based representation of GF arithmetic circuits, where the graphs are referred to as GF Arithmetic Circuit Graphs (GF-ACGs). Figure 1 shows an overview of a GF-ACG. A GF-ACG G is defined as (N, E), where N is a set of nodes, and E is a set of directed edges. The node represents an arithmetic circuit by its functional assertion and internal structure. The directed edge represents the flow of data between nodes, and defines the data dependency. We assume that every node has at least one edge connection.
A node n (∈ N) is defined by (F, G), where F is the functional assertion given as a set of equations over GFs (GF equations) and G is the internal structure given as a smaller GF-ACG. A node at the lowest level of abstraction, which does not have an internal structure, is described as (F, nil). A functional assertion is represented as a relation E_l = E_r, where E_l and E_r are the output and input expressions, respectively, and each expression is a variable, a constant, or a combination of two or more expressions connected by the arithmetic operations +, −, ×, and /.
A directed edge e (∈ E) is defined as (src, dest, x), where src and dest represent the start and end nodes, respectively, and x represents the variable indicating an element of a GF. If either src or dest is nil, the directed edge represents an external input or output of the given GF-ACG. Each variable is associated with a Galois field. A Galois field GF based on polynomial basis (PB) is defined as (B, C, IP), where B is the basis, C is the coefficient vector, and IP is the irreducible polynomial. More precisely, B, C, and IP are given as
Fig. 1 Galois-field arithmetic circuit graph.
where β is the indeterminate element, C_i is the coefficient set of degree i, m is the degree of field extension, and c_i is an element of the coefficient set C_i. IP = nil if the GF is a prime field. Thus, the above description can handle both prime and extension fields. Let h (0 ≤ h ≤ m − 1) and l (0 ≤ l ≤ h) be the most and least significant degrees, respectively. A variable is represented as x = (GF, (h, l)), where the tuple (h, l) is called the degree range. Using the above notation, we can handle a specific variable x_i of degree i. A variable is represented as an expression at a lower level of abstraction. Let x be a variable and x_i (l ≤ i ≤ h) be a lower-level variable. We have two types of decomposition nodes whose functions are given as
Equation (4) indicates that
On the other hand, Eq. (5) indicates that x ∈ GF(p^m) is divided into a number of variables over the prime field (i.e.,
We also have two types of composition nodes given as inverse relations between the above inputs and outputs. Using the decomposition/composition nodes, we can change the level of abstraction in edge representation. Note here that these nodes are implemented by wiring and have no internal structures.
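The definitions above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the class and field names (`GaloisField`, `Variable`, `Node`, `Edge`, `Graph`), the use of strings for equations and basis elements, and the use of `None` for nil are all assumptions made here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class GaloisField:
    basis: Tuple[str, ...]                   # B, one entry per degree (hypothetical string form)
    coeff_sets: Tuple[Tuple[int, ...], ...]  # C, one coefficient set per degree
    irreducible: Optional[str] = None        # IP, None plays the role of nil (prime field)


@dataclass
class Variable:
    name: str
    gf: GaloisField
    degree_range: Tuple[int, int]            # (h, l) with 0 <= l <= h <= m - 1

    def __post_init__(self):
        h, l = self.degree_range
        m = len(self.gf.basis)
        if not (0 <= l <= h <= m - 1):
            raise ValueError("degree range must satisfy 0 <= l <= h <= m - 1")


@dataclass
class Node:
    name: str
    assertions: List[str]                    # F, the functional assertion (GF equations)
    structure: Optional["Graph"] = None      # G, None means a lowest-level node (F, nil)


@dataclass
class Edge:
    src: Optional[Node]                      # None marks an external input
    dest: Optional[Node]                     # None marks an external output
    var: Variable


@dataclass
class Graph:
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def external_inputs(self) -> List[Edge]:
        return [e for e in self.edges if e.src is None]

    def external_outputs(self) -> List[Edge]:
        return [e for e in self.edges if e.dest is None]


# A lowest-level AND node over GF(2), used purely as an example.
gf2 = GaloisField(basis=("1",), coeff_sets=((0, 1),))
a, b, c = (Variable(n, gf2, (0, 0)) for n in "abc")
and_node = Node("AND", ["c = a * b", "a = a^2", "b = b^2"])
g = Graph(nodes=[and_node],
          edges=[Edge(None, and_node, a), Edge(None, and_node, b), Edge(and_node, None, c)])
print(len(g.external_inputs()), len(g.external_outputs()))
```

Keeping the internal structure as an optional nested `Graph` mirrors the recursive (F, G) definition, so a hierarchy of any depth uses the same types.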
The above GF-ACG can be used also for representing any logic circuit. A logic variable is considered as a variable over the GF whose coefficient set is limited to the zero element "0" and the unit element "1". Any logical operation can be represented with pseudo logic equations. For example, the functions of AND and XOR circuits are given as
respectively. Note that the idempotent law is considered as one of the functional assertions in the corresponding node (i.e., a = a^2 and b = b^2). Thus, GF-ACG can represent any arithmetic circuit over a GF represented by PB and any logic circuit. The arithmetic circuits given by GF-ACGs are verified by a formal verification method using Gröbner Basis and a polynomial reduction technique. (See [9] for the detailed verification procedure.)
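As a rough illustration of checking a functional assertion against an internal structure, the sketch below works over GF(2) with the idempotent law built in, assuming AND is modeled as c = a × b and XOR as c = a + b. It is a simplified stand-in and not the procedure of [9]: instead of computing a Gröbner basis, it reduces the assertion by substituting each node's output with its defining expression, which is enough when every node equation defines a distinct output variable. All function names are hypothetical.

```python
from itertools import product

# A polynomial over GF(2) with idempotent variables (x^2 = x) is stored as a
# frozenset of monomials; each monomial is a frozenset of variable names.
ZERO = frozenset()


def var(name):
    return frozenset({frozenset({name})})


def add(p, q):
    # Addition over GF(2): identical monomials cancel.
    return p ^ q


def mul(p, q):
    out = set()
    for m1, m2 in product(p, q):
        out ^= {m1 | m2}  # union of variables because x * x = x
    return frozenset(out)


def substitute(p, name, expr):
    # Replace every occurrence of variable `name` in p by polynomial `expr`.
    out = set()
    for mono in p:
        if name in mono:
            out ^= mul(frozenset({mono - {name}}), expr)
        else:
            out ^= {mono}
    return frozenset(out)


def verify(assertion, structure):
    # assertion: (output name, expression over external inputs)
    # structure: list of (output name, defining expression), ordered from the
    #            circuit outputs back toward the inputs.
    out_name, expr = assertion
    f = add(var(out_name), expr)  # f = E_l - E_r (minus equals plus over GF(2))
    for name, definition in structure:
        f = substitute(f, name, definition)
    return f == ZERO  # zero remainder: the structure realizes the assertion


# Internal structure: two AND nodes feeding one XOR node.
structure = [
    ("c", add(var("t0"), var("t1"))),
    ("t0", mul(var("a0"), var("b0"))),
    ("t1", mul(var("a1"), var("b1"))),
]
good = ("c", add(mul(var("a0"), var("b0")), mul(var("a1"), var("b1"))))
bad = ("c", add(mul(var("a0"), var("b1")), mul(var("a1"), var("b0"))))
print(verify(good, structure), verify(bad, structure))
```

The ordering requirement on `structure` matters: an internal variable can only be eliminated after every expression that mentions it has been expanded.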
Extension to Normal Basis Representation
This section presents an extension of GF-ACGs to arithmetic circuits over GFs represented by normal basis (NB).
Let α be the indeterminate element β raised to the n-th power (i.e., α = β^n), where the elements α^(q^0), α^(q^1), ..., α^(q^(m−1)) are linearly independent over GF(q) [12], [13]. A normal basis of GF(q^m) over GF(q) is then given by (α^(q^(m−1)), ..., α^(q^1), α^(q^0)), where q is a power of a prime number. It is well known that there is a normal basis for any positive integer m. Any field element is represented as a linear combination of the elements in a normal basis. For example, consider the finite field GF(2^3) generated by the irreducible polynomial β^3 + β + 1. If we choose α = β^3, then (α^4, α^2, α) is a normal basis. In order to handle NB representation, we introduce the expression of basis B by α instead of β. More precisely, a Galois field GF (= (B, C, IP)) based on NB is defined by
According to the extension, the expression of the second decomposition node given by Eq. (5) is also extended to
The corresponding composition node, as is the case for PB, is given as the inverse relation between the above input and output. Using the decomposition and composition nodes, we can also change the level of abstraction in any edge representation based on NB. Note here that we do not need to change the representation of any logic circuit even if we use the extended GF-ACG. As a result, we can apply the extended GF-ACG to any arithmetic circuit over GFs represented by NB. The formal verification method in [9] is extended accordingly. Figure 2 shows the extended algorithm, where GroebnerBasis(P) indicates Buchberger's algorithm to obtain a Gröbner Basis GB from a set of polynomials P. Given a functional assertion f and an internal structure G, P is generated from the functional assertions (i.e., F) in the internal structure. In the extended algorithm, we minimize the degree of F by Minimization(F) if F includes terms of the indeterminate elements. GB is then obtained from GroebnerBasis(P).
Buchberger's algorithm sometimes takes a long time and requires a large memory space. The degree of F is a major factor in its computation time since the number of polynomial reductions in the algorithm depends on the degree. As a result, the above minimization significantly reduces the computation time to generate GB. If the normal form of f with respect to GB is equal to zero, f is a member of the ideal generated by P. This means that the functional assertion can be realized with the internal structure. In that case, the verification algorithm returns true.
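The GF(2^3) example given earlier in this section can be checked numerically. The sketch below is an illustration only: it encodes field elements as 3-bit integers in the polynomial basis (bit i is the coefficient of β^i), and the helper names `gf_mul`, `gf_pow`, and `is_independent` are hypothetical.

```python
M = 3
IRRED = 0b1011  # beta^3 + beta + 1


def gf_mul(a, b):
    """Multiply two GF(2^3) elements given in the polynomial basis."""
    r = 0
    for i in range(M):
        if (b >> i) & 1:
            r ^= a << i
    # Reduce the degree-4 and degree-3 terms modulo the irreducible polynomial.
    for i in range(2 * M - 2, M - 1, -1):
        if (r >> i) & 1:
            r ^= IRRED << (i - M)
    return r


def gf_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r


def is_independent(vectors):
    """Linear independence over GF(2): no non-empty subset XORs to zero."""
    n = len(vectors)
    for mask in range(1, 1 << n):
        acc = 0
        for i in range(n):
            if (mask >> i) & 1:
                acc ^= vectors[i]
        if acc == 0:
            return False
    return True


beta = 0b010
alpha = gf_pow(beta, 3)  # alpha = beta^3 = beta + 1
normal = [gf_pow(alpha, 4), gf_pow(alpha, 2), alpha]
print(is_independent(normal))                                    # True
print(is_independent([gf_pow(beta, 4), gf_pow(beta, 2), beta]))  # False
```

The second check shows why α = β^3 is chosen here rather than β itself: β^4 = β^2 + β, so (β^4, β^2, β) is linearly dependent and cannot serve as a normal basis.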
Design and Verification of Massey-Omura Parallel Multipliers
This section presents the application of the extended GF-ACG to the design and verification of parallel multipliers based on NB representation. The Massey-Omura parallel multiplier [14] is a 2-input 1-output parallel multiplier over GF(2^m) represented by NB, which has an efficient structure that reduces the redundancy of the well-known Massey-Omura multiplier [15]. Let a and b ∈ GF(2^m) be the inputs and let c ∈ GF(2^m) be the output. Let a
where
For the GF-ACG design, we derive a hierarchical description from the above flattened description. First, Eq. (12) is simplified as follows:
Here, the terms in the parenthesis are given as
The operation of Massey-Omura parallel multiplier is finally represented by the following two equations:
This suggests that at the 2nd level of the hierarchy, a Massey-Omura parallel multiplier is represented by a GF-ACG with two nodes performing the operations corresponding to Eqs. (15) and (16), respectively. Equation (15) is then represented by
where a and these adders are given by 2-input 1-output adders over GF(2). For the 4th-level description, let a
Thus, the node of Eq. (17) (2) , and the node of Eq. (19) is given by some 2-input 1-output adders over GF(2). Figure 3 shows the GF-ACGs for the Massey-Omura parallel multiplier over GF(2^3), where the GF-ACGs are represented by five levels of abstraction. The nodes in Figs. 3(a), (b), (c) and (d) correspond to the shaded parts in Figs. 3(b), (c), (d) and (e), respectively. Here, "GFA0" and "GFA1" in Figs. 3(c) and (d) correspond to G_16 and G_17 in Fig. 3(e), respectively. Table 1 shows the details of the nodes, GFs and variables used in Fig. 3. In this example, α is the cube of β (i.e., α = β^3). Note that the decomposition/composition nodes are not shown in Table 1.
Table 1 Nodes, Galois fields, and variables for GF(2^3) Massey-Omura parallel multiplier in Fig. 3.
The 2nd-level nodes "Partial Product Generator" and "Accumulator" in Fig. 3(b) have functional assertions corresponding to Eqs. (15) and (16), respectively. The 3rd-level nodes "PPGi" in Fig. 3(c) have functional assertions corresponding to Eq. (17). The nodes "GFA0" and "GFA1" in Figs. 3(c) and (d) indicate 2-input 1-output adders over GF(2^3) used to construct "Accumulator". The 4th-level nodes "SubPPGi" and "SubACCl" in Fig. 3(d) have functional assertions corresponding to Eqs. (18) and (19), respectively. If the number of δ satisfying δ_{i,j,k} = 1 is one, s_{i,j} becomes w
in the functional assertion of "SubPPGi" instead of Eq. (19). It is important to note that we can simply extend the above GF-ACG description to describe any Massey-Omura parallel multiplier over GF(2^m) (m ≥ 2). In order to demonstrate the capability of the proposed method, we verified a set of the designed Massey-Omura parallel multipliers over GF(2^m) (2 ≤ m ≤ 64). In this experiment, we ran the proposed verification technique using Risa/Asir on a Linux PC with an Intel Xeon E5450 3.00 GHz processor and 32 GB of RAM. Both the original algorithm and the extended algorithm were run under the same conditions. For comparison, we also performed Verilog-XL simulation using the corresponding HDL descriptions. Table 2 shows the verification results. We were not able to complete the simulation of the GF(2^16) and larger multipliers in this experiment because the verification time increases exponentially as the signal length increases. On the other hand, using our extended method, we were able to complete the verification even for the 64-bit multiplier over GF(2^64).
Table 2 columns: (a) Verilog-XL simulation, (b) previous work [9], (c) this work.
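The idea of multiplying in a normal basis by generating partial products with AND and accumulating them with XOR can be illustrated in software for GF(2^3). The sketch below is not the hierarchical Massey-Omura structure of [14]; it is a plain bilinear formulation under assumptions made here: bit k of an operand is the coefficient of α^(2^k), and the table `delta` (a hypothetical name and indexing, not necessarily the paper's δ_{i,j,k}) stores the NB coordinates of each product of two basis elements.

```python
M = 3
IRRED = 0b1011  # beta^3 + beta + 1


def gf_mul(a, b):
    """Reference multiplication in the polynomial basis of GF(2^3)."""
    r = 0
    for i in range(M):
        if (b >> i) & 1:
            r ^= a << i
    for i in range(2 * M - 2, M - 1, -1):
        if (r >> i) & 1:
            r ^= IRRED << (i - M)
    return r


# basis[k] = alpha^(2^k) in polynomial-basis form, with alpha = beta^3.
alpha = gf_mul(gf_mul(0b010, 0b010), 0b010)
basis = [alpha]
for _ in range(M - 1):
    basis.append(gf_mul(basis[-1], basis[-1]))


def nb_to_pb(bits):
    """Convert NB coordinates to a polynomial-basis element."""
    acc = 0
    for k in range(M):
        if (bits >> k) & 1:
            acc ^= basis[k]
    return acc


# Inverse conversion by exhaustive table (fine for a 3-bit field).
pb_to_nb = {nb_to_pb(v): v for v in range(1 << M)}

# delta[i][j]: NB coordinates of basis[i] * basis[j].
delta = [[pb_to_nb[gf_mul(basis[i], basis[j])] for j in range(M)] for i in range(M)]


def nb_mul(a, b):
    c = 0
    for i in range(M):
        for j in range(M):
            if (a >> i) & 1 and (b >> j) & 1:  # partial product a_i AND b_j
                c ^= delta[i][j]               # accumulation by XOR over GF(2)
    return c


# Exhaustive check against the polynomial-basis reference multiplier.
ok = all(
    nb_to_pb(nb_mul(a, b)) == gf_mul(nb_to_pb(a), nb_to_pb(b))
    for a in range(1 << M)
    for b in range(1 << M)
)
print(ok)
```

Exhaustive comparison is only practical for tiny fields like this one, which is the motivation for the formula-based verification discussed in this section.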
Application to Exponentiation Circuits over GF(2 m )
This section applies the extended GF-ACG to GF(2^m) exponentiation circuits given by NB representation and shows their performance. One major feature of NB representation is that the squaring operation is done by a cyclic shift (i.e., wiring) without any hardware component. The GF exponentiation circuits designed here include such squaring operations depending on the exponent.
Table 3 Nodes, Galois fields, and variables for cubic circuit in Fig. 4.
We design such exponentiation circuits based on NB using GF-ACGs. The Massey-Omura parallel multipliers described in the previous section are used for the multiplication, and graphs performing the cyclic shift are added for the squaring. Figure 4 shows an example of the GF-ACGs for a cubic circuit given as c = a^3. Table 3 shows the details of the nodes, GFs and variables used in Fig. 4. Note here that Cyclic Shift is implemented by wiring and has no internal structure.
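The cubic circuit c = a^3 = a^2 × a can be mimicked in software to show why squaring costs nothing in NB. This is a sketch under assumptions made here: GF(2^3) with β^3 + β + 1 and α = β^3, bit k of an NB word holding the coefficient of α^(2^k), and a multiplier modeled by converting to the polynomial basis and back rather than by the Massey-Omura structure. The function names are hypothetical.

```python
M = 3
IRRED = 0b1011  # beta^3 + beta + 1
MASK = (1 << M) - 1


def gf_mul(a, b):
    """Multiplication in the polynomial basis of GF(2^3)."""
    r = 0
    for i in range(M):
        if (b >> i) & 1:
            r ^= a << i
    for i in range(2 * M - 2, M - 1, -1):
        if (r >> i) & 1:
            r ^= IRRED << (i - M)
    return r


# basis[k] = alpha^(2^k), with alpha = beta^3.
alpha = gf_mul(gf_mul(0b010, 0b010), 0b010)
basis = [alpha]
for _ in range(M - 1):
    basis.append(gf_mul(basis[-1], basis[-1]))


def nb_to_pb(bits):
    acc = 0
    for k in range(M):
        if (bits >> k) & 1:
            acc ^= basis[k]
    return acc


pb_to_nb = {nb_to_pb(v): v for v in range(1 << M)}


def cyclic_shift(a):
    """Squaring in NB: coefficient of alpha^(2^k) moves to alpha^(2^(k+1))."""
    return ((a << 1) | (a >> (M - 1))) & MASK


def nb_mul(a, b):
    """Stand-in for the NB multiplier (conversion-based, for illustration)."""
    return pb_to_nb[gf_mul(nb_to_pb(a), nb_to_pb(b))]


def cube(a):
    # One wiring-only squaring followed by a single multiplication.
    return nb_mul(cyclic_shift(a), a)


squares_ok = all(
    nb_to_pb(cyclic_shift(a)) == gf_mul(nb_to_pb(a), nb_to_pb(a))
    for a in range(1 << M)
)
cubes_ok = all(
    nb_to_pb(cube(a)) == gf_mul(gf_mul(nb_to_pb(a), nb_to_pb(a)), nb_to_pb(a))
    for a in range(1 << M)
)
print(squares_ok, cubes_ok)
```

The shift works because squaring is linear over GF(2) and maps each basis element α^(2^k) to the next one, with α^(2^m) wrapping around to α.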
The area and delay of the exponentiation circuits were evaluated using Synopsys Design Compiler with a TSMC 65-nm cell library. The extension degree used in this experiment was 8 (i.e., GF(2^8)). For comparison, we also designed the corresponding exponentiation circuits based on the PB representation presented in [9]. Figure 5 shows the area and delay of the exponentiation circuits. We confirmed that as the exponent b increased, the area and the delay of the PB-based exponentiation circuits increased as O(log b), because they were constructed as a tree structure of multipliers. On the other hand, the NB-based exponentiation circuits showed better performance than the PB-based ones for both area and delay because squaring operations were free of cost in the NB-based circuits.
Application to Inversion Circuit over GF(((2^2)^2)^2)
This section presents a further extension of GF-ACGs to composite fields based on NB and shows an application of the extended GF-ACG to a multiplicative inversion circuit over the composite field GF(((2^2)^2)^2) that can be implemented more compactly than its counterpart based on PB [16].
In order to describe a composite field based on NB, the representation of coefficient sets is extended in such a way as to include all the elements of its basic field. In the following, we present the GF((2^2)^2) description as an example. Let GF(2^2) be the basic field, given as
The composite field GF((2^2)^2) is then given as
where the elements of GF(2^2) are included with the primitive element γ_0 in the exponential representation. Figure 6 shows a GF-ACG for the inversion circuit at three levels of abstraction, and Table 4 shows the nodes, GFs and variables in Fig. 6. The "Inversion" in Fig. 6(a) is the highest-level node. Each node has an internal structure given as a combination of lower-level nodes in the corresponding shaded part. Note again that decomposition/composition nodes are not shown in Table 4.
The functional assertion of the "Inversion" is given as y = x^254 according to the definition of multiplicative inversion. The circuit outputs a value of zero when the input is zero. As shown in Fig. 6(b), the internal structure consists of three multipliers, two adders, one squaring coefficient multiplier and one inverter over GF((2^2)^2). Each circuit over GF((2^2)^2) is recursively described with lower-level GF(2^2) circuits, which are shown in Fig. 6(c). The lower-level nodes in Fig. 6(c) are also described with the lowest-level nodes over GF(2). The "Inversion" was verified with the proposed verification technique in about 2.5 s on the PC mentioned in Sect. 4.
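The functional assertion y = x^254 can be sanity-checked in software, since every nonzero element of a field with 256 elements satisfies x^255 = 1. The sketch below is only an illustration: it uses a polynomial-basis GF(2^8) with the irreducible polynomial x^8 + x^4 + x^3 + x + 1 (0x11B), which is an assumption made here and not the NB composite-field representation of the designed circuit; the function names are hypothetical.

```python
IRRED = 0x11B  # x^8 + x^4 + x^3 + x + 1, assumed for illustration only


def gf256_mul(a, b):
    """Shift-and-add multiplication in GF(2^8), polynomial basis."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRRED  # reduce as soon as the degree reaches 8
        b >>= 1
    return r


def gf256_pow(x, e):
    """Square-and-multiply exponentiation."""
    result = 1
    while e:
        if e & 1:
            result = gf256_mul(result, x)
        x = gf256_mul(x, x)
        e >>= 1
    return result


def inversion(x):
    # Functional assertion y = x^254; maps 0 to 0 as the circuit does.
    return gf256_pow(x, 254)


all_inverses_ok = all(gf256_mul(x, inversion(x)) == 1 for x in range(1, 256))
print(inversion(0), all_inverses_ok)
```

The same identity holds in any representation of the 256-element field, so the check carries over to the composite-field circuit up to the change of basis.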
The area and delay of the inversion circuit described in Fig. 6 and the corresponding inversion circuit based on PB in [10] were evaluated under the same conditions mentioned in Sect. 4. Table 5 shows the comparison result. We confirmed that the NB-based inversion circuit showed better performance than the PB-based inversion circuit. This suggests the feasibility and advantage of the extended design and verification method.
Table 4 Nodes, Galois fields, and variables for the GF(((2^2)^2)^2) inversion circuit in Fig. 6.
Conclusion
This paper presented a formal design of GF arithmetic circuits represented by normal basis (NB). First, we extended GF-ACG to describe any GF based on NB in addition to polynomial basis (PB) and presented a formal design of Massey-Omura parallel multipliers with the extended GF-ACG. The experimental results showed that the verification time was greatly reduced compared with that of the conventional methods. For example, a multiplier over GF(2^64) was verified within 7 minutes. As another application, we also designed a set of NB-based exponentiation circuits and evaluated their performance in comparison with that of the corresponding PB-based circuits. In addition, we presented a further extension of GF-ACG to composite fields based on NB and applied it to an inversion circuit based on GF((2^2)^2). The proposed method is applicable to both binary and multiple-valued implementations since the GF-ACG description is technology-independent except for the lowest-level description. The formal design of GF arithmetic circuits based on both PB and NB remains for future study.
