This paper presents a new recursive formulation for computing the Walsh-Hadamard Transform (WHT) that allows higher-order (larger) 2-D WHT architectures to be generated from four lower-order (smaller) WHT architectures. Our methodology is based on manipulating tensor product forms so that they can be mapped directly onto modular parallel architectures. The resulting WHT circuits have a very simple modular structure and a regular topology.
INTRODUCTION
The Walsh-Hadamard Transform (WHT) has been used in many DSP, image, and video processing applications such as filter generating systems [9], block orthogonal transforms (BOTs) [5], and block wavelet transforms [2]. Applications in communications include CDMA [1] and spread spectrum [6]. This paper proposes an efficient and cost-effective methodology for mapping the WHT onto VLSI structures. The main objective of this paper is to derive a design methodology and recursive formulation for computing the multidimensional (m-d) WHT that is useful for the true modularization and parallelization of the resulting computation.
The main result reported in this paper shows that a large two-dimensional (2-D) WHT computation on an $n \times n$ input image can be decomposed recursively into three stages, as shown in Fig. 1 for the case n = 4. The second stage is constructed recursively from four parallel (data-independent) blocks, each realizing a smaller-size WHT. The pre-additions and post-permutations stages serve as "glue" circuits that combine the $2^2$ lower-order WHT blocks to construct the higher-order WHT architecture. Observe that we have drawn our networks such that data flows from right to left. We chose this convention to show the direct correspondence between the derived algorithms and the proposed VLSI networks. Although, as far as we know from the literature, the recursive 1-
$P_{n,n_1 n_2} = P_{n,n_1} P_{n,n_2}$
where $\otimes$ denotes the tensor product, $I_n$ is the identity matrix of size $n$, and $P_{n,s}$ is an $n \times n$ binary matrix specifying an $n/s$-shuffle (or $s$-stride) permutation.

This paper is organized as follows. In Section 2 we modify the original 1-D WHT. In Section 3 we then propose the 2-D WHT recursive algorithm. Finally, we conclude our results.
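The stride permutation defined above can be illustrated with a short sketch. The indexing convention used here (output position $j \cdot (n/s) + i$ receives input element $i \cdot s + j$) is an assumption, since conventions differ between authors, and the helper names `stride_permute` and `stride_matrix` are hypothetical. Under this convention, composing two strides gives the stride equal to their product, provided that product divides $n$.

```python
def stride_permute(x, s):
    """Apply an s-stride permutation to the list x.

    Assumed convention: out[j*(n//s) + i] = x[i*s + j], i.e. x is read
    at stride s starting from offsets 0, 1, ..., s-1 in turn.
    """
    n = len(x)
    if n % s != 0:
        raise ValueError("stride s must divide len(x)")
    return [x[i * s + j] for j in range(s) for i in range(n // s)]


def stride_matrix(n, s):
    """The n x n 0/1 matrix P such that P applied to x equals stride_permute(x, s)."""
    cols = stride_permute(list(range(n)), s)  # row r has its single 1 in column cols[r]
    return [[1 if c == cols[r] else 0 for c in range(n)] for r in range(n)]


x = list(range(8))
print(stride_permute(x, 2))                     # [0, 2, 4, 6, 1, 3, 5, 7]
print(stride_permute(stride_permute(x, 2), 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
print(stride_permute(x, 4))                     # [0, 4, 1, 5, 2, 6, 3, 7]
```

The last two lines agree, which is the composition behaviour of stride permutations for $n = 8$ and strides 2 and 2.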
THE MODIFIED FORMULATIONS OF THE 1-D WHT
In this section, we modify the original 1-D WHT into an iterative form that saves hardware without affecting the processing speed.
The 1-D WHT Iterative Formulation
Let $k = \log_2 n$. We can then write equation (6), which, using property (4), can be modified to
k-1
As an example, we can express $W_8$ as

The realization of $W_8$ is shown in Fig. 2(a).
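For orientation, the following sketch checks one well-known stage-wise factorization of $W_8$ numerically. It assumes the textbook form $W_n = \prod_{i=0}^{k-1} (I_{2^i} \otimes W_2 \otimes I_{2^{k-i-1}})$ with $n = 2^k$ and natural (Hadamard) ordering; the modified form used in this paper may place permutations differently, so this is an illustration rather than a reproduction of equation (7). All helper names are hypothetical.

```python
def kron(A, B):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


W2 = [[1, 1], [1, -1]]


def wht_kron(k):
    """W_n as the k-fold tensor product of W_2 (n = 2**k)."""
    W = [[1]]
    for _ in range(k):
        W = kron(W, W2)
    return W


def wht_stages(k):
    """W_n as a product of k butterfly stages I_{2^i} (x) W_2 (x) I_{2^(k-i-1)}."""
    W = identity(2 ** k)
    for i in range(k):
        stage = kron(kron(identity(2 ** i), W2), identity(2 ** (k - i - 1)))
        W = matmul(W, stage)
    return W


# For n = 8 (k = 3) the three-stage product equals W_2 (x) W_2 (x) W_2.
assert wht_stages(3) == wht_kron(3)
```

Each stage contains only additions and subtractions on pairs of inputs, which is why such factorizations map naturally onto butterfly hardware.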
Applying property (5) 
The 1-D WHT Recursive Formulation
Applying property (1), equation (7)

$$X = W_{n_1,n_2}\, x, \qquad (13)$$

where $W_{n_1,n_2}$ is the 2-D WHT transform matrix for an $n_1 \times n_2$ image, and $X$ and $x$ are the output and input column-scanned vectors, respectively.
THE PROPOSED FORMULATION OF
For separable transforms, the matrix $W_{n_1,n_2}$ can be represented by the tensor product form [7]

$$W_{n_1,n_2} = W_{n_1} \otimes W_{n_2}, \qquad (14)$$

where $W_{n_1}$ and $W_{n_2}$ are the row and column 1-D WHT operators, as defined by equation (7), on $x$, respectively. By substituting (14) in (13), we have

$$X = (W_{n_1} \otimes W_{n_2})\, x. \qquad (15)$$
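The tensor product form of the 2-D transform can be checked numerically with a small sketch. It assumes natural (Hadamard) ordering and a scan order in which the first tensor factor acts on the first array index (flat index $i \cdot n_2 + j$); whether this matches the paper's column scan depends on its indexing convention, so treat the scan order as an assumption. The function names are hypothetical.

```python
def hadamard(n):
    """Natural-order WHT matrix W_n via Sylvester's recursion (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    W = [[1]]
    while len(W) < n:
        W = [r + r for r in W] + [r + [-v for v in r] for r in W]
    return W


def kron(A, B):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def wht2d_separable(Z):
    """Compute W_{n1} Z W_{n2}^T by 1-D transforms along each index in turn."""
    n1, n2 = len(Z), len(Z[0])
    Wa, Wb = hadamard(n1), hadamard(n2)
    T = [matvec(Wb, row) for row in Z]           # transform along the second index
    cols = [matvec(Wa, col) for col in zip(*T)]  # transform along the first index
    return [list(r) for r in zip(*cols)]         # transpose back to n1 x n2


Z = [[1, 2, 3, 4], [5, 6, 7, 8]]                 # n1 = 2, n2 = 4
x = [v for row in Z for v in row]                # scanned input vector
X = matvec(kron(hadamard(2), hadamard(4)), x)    # (W_{n1} (x) W_{n2}) x
assert X == [v for row in wht2d_separable(Z) for v in row]
```

The explicit Kronecker matrix has $(n_1 n_2)^2$ entries, so it is useful only as a reference for checking; the separable route performs the same computation with far fewer operations.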
Using equation (7), this can be expressed as

Therefore, the 2-D WHT on an $n_1 \times n_2$ input is equivalent to a 1-D WHT on a 1-D input vector of size $n_1 n_2$ that can be implemented using either the modified 1-D iterative algorithm given by (10) or the modified 1-D recursive algorithm given by (11).
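The equivalence stated above can be illustrated with a fast butterfly implementation. This is a minimal sketch assuming natural (Hadamard) ordering, an unnormalized transform, and the same scan order as the flat index $i \cdot n_2 + j$; it uses the standard in-place butterfly algorithm, not the paper's algorithms (10) or (11), which are not reproduced here. The names `fwht` and `wht2d` are hypothetical.

```python
def fwht(v):
    """Unnormalized 1-D WHT in natural order; len(v) must be a power of two."""
    v = list(v)
    n = len(v)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # One butterfly stage: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = v[i], v[i + h]
                v[i], v[i + h] = a + b, a - b
        h *= 2
    return v


def wht2d(Z):
    """Separable 2-D WHT: 1-D transforms along rows, then along columns."""
    rows = [fwht(r) for r in Z]
    cols = [fwht(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


Z = [[1, 2, 3, 4], [5, 6, 7, 8]]      # n1 = 2, n2 = 4
flat = [v for row in Z for v in row]  # 1-D vector of size n1 * n2 = 8

# A single length-8 1-D WHT on the scanned vector matches the 2-D WHT.
assert fwht(flat) == [v for row in wht2d(Z) for v in row]
```

The check relies on the fact that, in natural ordering, $W_{n_1} \otimes W_{n_2} = W_{n_1 n_2}$ when $n_1$ and $n_2$ are powers of two.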
The Truly Recursive Formulation of the 2-D WHT
Now we will derive a truly 2-D recursive formulation of the WHT by further manipulation of equation (15)
Qnl,n2 = ( 9 n 1 , 2 @Inz 12) en,,,, 9 -
Rnl,n2 = ( 4 n l , n l @ I n 2 1 2 ) . 
4-
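The three-stage structure described in the Introduction (pre-additions, four independent smaller WHT blocks, then recombination) can be sketched in software as follows. This is an illustration under stated assumptions, not the paper's derivation: it assumes a square $2^k \times 2^k$ input, natural ordering, and the identity $W_n = W_2 \otimes W_{n/2}$, and it places the four results directly into output quadrants instead of using the permutation matrices of the column-scanned formulation. The function name `wht2d_recursive` is hypothetical.

```python
def wht2d_recursive(Z):
    """Unnormalized natural-order 2-D WHT of a square 2^k x 2^k list of rows."""
    n = len(Z)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in Z):
        raise ValueError("input must be square with power-of-two size")
    if n == 1:
        return [[Z[0][0]]]
    h = n // 2
    # Split the input into quadrants [[a, b], [c, d]].
    a = [row[:h] for row in Z[:h]]
    b = [row[h:] for row in Z[:h]]
    c = [row[:h] for row in Z[h:]]
    d = [row[h:] for row in Z[h:]]

    def comb(sb, sc, sd):
        # Signed elementwise sum a + sb*b + sc*c + sd*d.
        return [[a[i][j] + sb * b[i][j] + sc * c[i][j] + sd * d[i][j]
                 for j in range(h)] for i in range(h)]

    # Stage 1: pre-additions (a 2 x 2 WHT applied across the quadrants).
    pre = [comb(1, 1, 1), comb(-1, 1, -1), comb(1, -1, -1), comb(-1, -1, 1)]
    # Stage 2: four data-independent half-size 2-D WHTs.
    q00, q01, q10, q11 = (wht2d_recursive(p) for p in pre)
    # Stage 3: place the four results into the output quadrants.
    top = [r0 + r1 for r0, r1 in zip(q00, q01)]
    bottom = [r0 + r1 for r0, r1 in zip(q10, q11)]
    return top + bottom


print(wht2d_recursive([[1, 2], [3, 4]]))  # [[10, -2], [-4, 0]]
```

The four recursive calls in stage 2 share no data, which mirrors the parallel (data-independent) blocks of the proposed architecture.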

Figure 3. The reduced hardware realization of the modified 1-D WHT algorithm.

Figure 4. The realization of the recursive 1-D WHT.
