String matching (SM) 
INTRODUCTION
String matching (SM) is a basic search operation on patterns. In many applications, a special encoding of strings is advantageous both for saving storage and for manipulating the strings. One well-known method, which has been widely used in many fields and has played a valuable historical role in the development of data compression, is run-length coding.
The basic idea of this method is to replace each sequence of identical consecutive symbols with the representative symbol and its multiplicity. For example, the run-length coded representation of the string aaaabbbaaacccc is a4b3a3c4, i.e., (a, 4)(b, 3)(a, 3)(c, 4), and the length is reduced from 14 to 8. A variable length don't care (VLDC) is a special symbol, not belonging to Σ, that stands for an arbitrary string in Σ*. Each VLDC in the pattern can match any substring of the text (possibly of zero length). For example, given a text string 'cccaaaaabbaaabbbccccddaabb' and a pattern string 'aabb.cccddaa', where '.' is the VLDC, the two matched positions are from 7 to 24 and from 12 to 24. The SM problem for run-length coded strings with VLDCs can be viewed as an extension of the classical SM problem and has many important applications [4].
*Corresponding author, e-mail: klchung@cs.ntit.edu.tw. This research was supported in part by the National Science Council of R.O.C. under contracts NSC85-2121-E011-009, NSC85-2213-M011-002, NSC86-2213-E011-010 and NCHC86-08-015.
[...] Figure 3. Here, n = 12 and m = 6.
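As a sequential reference point for the two notions introduced above, the following minimal Python sketch run-length codes a string and matches a pattern containing VLDCs against an uncoded text. The function names `rle` and `vldc_matches`, the use of '*' as the VLDC symbol, and the convention of reporting the earliest possible ending position for each starting position are assumptions made for illustration; this is not the parallel algorithm of the paper.

```python
def rle(s):
    """Run-length code a string as a list of (symbol, multiplicity) pairs."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def vldc_matches(text, pattern, vldc="*"):
    """Return 1-indexed (start, end) positions of the matches of `pattern`.

    `vldc` is a hypothetical choice of VLDC symbol; each VLDC may match any
    substring of the text, including the empty one.
    """
    segments = pattern.split(vldc)
    first = segments[0]
    matches = []
    start = text.find(first)
    while start != -1:
        pos = start + len(first)
        ok = True
        # Place every later segment at its leftmost occurrence after `pos`.
        for seg in segments[1:]:
            hit = text.find(seg, pos)
            if hit == -1:
                ok = False
                break
            pos = hit + len(seg)
        if ok:
            matches.append((start + 1, pos))
        start = text.find(first, start + 1)
    return matches


coded = rle("aaaabbbaaacccc")
found = vldc_matches("cccaaaaabbaaabbbccccddaabb", "aabb*cccddaa")
```

With the text and pattern of the example (and '*' standing for the VLDC), `found` is `[(7, 24), (12, 24)]`, which agrees with the two matched positions given above, and `coded` has four pairs, i.e., a coded length of 8.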
Algorithm_1
Step 1 [...]
Step 2 [...] P(1, i) and P(2, i) to the others in the same row via the horizontal bus system.
FIGURE 3 The initial configuration of the 6 × 12 RM.
Step 3 Each processor first disconnects its vertical and horizontal connections. Then, each processor holding the symbol ',' sends a special symbol, say '+', to its south neighbor and its north neighbor. Figure 4 illustrates only the special symbols, ',' and '+', in each processor.
Step 4 Except for the processors in the first row, each processor connects its N and E ports.
Step 5 Each processor in the first row, the last row, and the rows holding '+' connects its W and S ports when P(1, i) = T(1, j) and P(2, i) ≤ T(2, j). Otherwise, it does nothing.
Step 6 Each processor PE(i, j), 2 ≤ i ≤ m − 1, connects its W and S ports when P(1, i) = T(1, j) and P(2, i) = T(2, j). Otherwise, it does nothing.
Step 7 Each processor PE(i, j) holding ',' connects its W and S ports, and also connects its W and E ports. [...] configuration of the RM after performing this step.
Step 8 Each processor PE(m, j), 1 ≤ j ≤ n, with the connection linking the W and S ports first sends a symbol, say '!', from the S port to processor PE(1, k) for 1 ≤ k ≤ n. If a processor holding the symbol ',' receives the symbol '!' from its S port, then it disconnects its W and E ports, disconnects its N and E ports if they are connected, and connects its N and S ports. This avoids the negative length effect. Figure 6 [...]
Step 9 [...] In our example, processor PE(1, 2) reports the data (2, 4) as the matched starting location and (8, 2) (see Fig. 7) as the matched ending location; PE(1, 4) reports (4, 2) as the matched starting location and (8, 2) (see Fig. 7) as the matched ending location; PE(1, 8) reports (8, 1) as the matched starting location and (12, 2) (see [...]
[...] 2⌋(N − 1) + N)). In addition, processor PE(i, 1) stores the data (P(1, i), P(2, i)) for 1 ≤ i ≤ m.
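The switch conditions of Steps 5 and 6 of Algorithm_1 can be illustrated with a small sequential sketch that computes, for every processor, whether it would connect its W and S ports. The function name `ws_connections`, the representation of a VLDC entry as `None`, and the 0-indexed grid are assumptions for illustration. The sample text is the run-length coding of the Introduction's text string, which is not necessarily the 12-entry text of the paper's figures.

```python
def ws_connections(P, T):
    """Return grid[i][j] = True when PE(i+1, j+1) would connect W and S.

    P and T are lists of (symbol, run_length) pairs; a VLDC entry in P is
    represented by None (an assumption of this sketch). VLDC rows are left
    as None because Step 7 handles them separately.
    """
    m = len(P)
    # Rows directly north or south of a VLDC row receive '+' in Step 3.
    plus_rows = set()
    for i, entry in enumerate(P):
        if entry is None:
            plus_rows.update(k for k in (i - 1, i + 1) if 0 <= k < m)

    grid = []
    for i, entry in enumerate(P):
        if entry is None:
            grid.append([None] * len(T))
            continue
        sym, length = entry
        # Step 5: first row, last row, and '+' rows only need length <=.
        relaxed = i == 0 or i == m - 1 or i in plus_rows
        row = []
        for t_sym, t_len in T:
            if relaxed:
                row.append(sym == t_sym and length <= t_len)
            else:
                # Step 6: the remaining rows need an exact run match.
                row.append(sym == t_sym and length == t_len)
        grid.append(row)
    return grid


P = [("a", 2), ("b", 2), None, ("c", 3), ("d", 2), ("a", 2)]
T = [("c", 3), ("a", 5), ("b", 2), ("a", 3), ("b", 3),
     ("c", 4), ("d", 2), ("a", 2), ("b", 2)]
grid = ws_connections(P, T)
```

In this sample, the row for (d, 2) is the only one that requires an exact match, and it is satisfied only by the seventh text entry.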
In our example, the initial data allocation on the m × N (= M × N = 6 × 7) RM is illustrated in Figure 8, where in fact PE(1, 7) [...] By Lemma [...] Our parallel algorithm processes these pipes successively, from the last pipe to the first pipe, and the processing of each pipe is similar to the parallel algorithm described in Section 3. Our partitionable parallel algorithm for this case, i.e., N < n, is described below.
Algorithm_2
Step 1 Each processor PE(1, j), 1 ≤ j ≤ N, broadcasts all of its own data to the other processors in the same column via the vertical bus system. It takes O([...]) time.
Here, we assume that broadcasting one data item takes O(1) time.
Step 2 and Step 3 These two steps are the same as Step 2 and Step 3 described in Algorithm_1, respectively.
Step 5 Each processor in the first row, the last row, and the rows holding '+' connects its W (E) and S ports for odd (even) X when P(1, i) = T(1, 2⌊X/2⌋(N − 1) + j) (T(1, X(N − 1) − j + 2)) and P(2, i) ≤ T(2, 2⌊X/2⌋(N − 1) + j) (T(2, X(N − 1) − j + 2)). Otherwise, it does nothing.
Step 6 Each processor PE(i, j), 2 ≤ i ≤ m − 1, connects its W (E) and S ports for odd (even) X when P(1, i) = T(1, 2⌊X/2⌋(N − 1) + j) (T(1, X(N − 1) − j + 2)) and P(2, i) = T(2, 2⌊X/2⌋(N − 1) + j) (T(2, X(N − 1) − j + 2)). Otherwise, it does nothing.
Step 7 Each processor PE(i, j) holding '.' connects its W (E) and S ports for odd (even) X, and also connects its W and E ports.
Step 8 For pipe X, i.e., the Xth pipe, if X is odd (even), processor PE(i, N) (PE(i, 1)), 2 ≤ i ≤ m, holding the data sent from pipe X + 1, and processor PE(m, j), 1 ≤ j ≤ N, with the connection linking the W (E) and S ports, first send the symbol '!' from the S port to processor PE(1, k) for 1 ≤ k ≤ N or to PE(i, 1) (PE(i, N)). If a processor holding the symbol '.' receives the symbol '!' from its S port, then it disconnects its W and E ports, disconnects its N and E (W) ports if they are connected, and connects its N and S ports.
Step 9 Along the corresponding stairlike bus systems, processor PE(i, N) (PE(i, 1)), 2 ≤ i ≤ m, holding the data sent from pipe X + 1, and processor PE(m, j), 1 ≤ j ≤ N, with the connection linking the W (E) and S ports, send the received data or the data (2⌊X/2⌋(N − 1) + j, P(2, m)) ((X(N − 1) − j + 2, P(2, m))) from the S port to processor PE(1, k) for 1 ≤ k ≤ N or to PE(i, 1) (PE(i, N)). Finally, PE(i, 1) (PE(i, N)) keeps the received data, which will be used by pipe X − 1, and PE(1, k) reports the data (2⌊X/2⌋(N − 1) + k, T(2, 2⌊X/2⌋(N − 1) + k) − P(2, 1) + 1) ((X(N − 1) [...]
[...] (= b2) in the second pipe is shared with the first entry in the third pipe. In addition, for the run-length coded pattern in Figure 10, the last entry (= d2) in the first pipe is shared with the last entry in the second pipe. That is, the text is arranged in a snakelike row-major order and the pattern is arranged in a snakelike column-major order.
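The two index expressions used in Algorithm_2, 2⌊X/2⌋(N − 1) + j for odd X and X(N − 1) − j + 2 for even X, can be checked with a short sketch. It lists which text entry each column of a pipe holds and shows that consecutive pipes share one entry at alternating ends, which is the snakelike arrangement described above. The function names `text_index` and `pipe_layout` are hypothetical, and N = 7 is taken from the 6 × 7 example.

```python
def text_index(X, j, N):
    """Index of the text entry held in column j (1-indexed) during pipe X."""
    if X % 2 == 1:
        # Odd pipes run left to right.
        return 2 * (X // 2) * (N - 1) + j
    # Even pipes run right to left.
    return X * (N - 1) - j + 2


def pipe_layout(X, N):
    """Text indices held by columns 1..N while pipe X is processed."""
    return [text_index(X, j, N) for j in range(1, N + 1)]


N = 7
layouts = {X: pipe_layout(X, N) for X in (1, 2, 3)}
```

For N = 7, pipe 1 holds entries 1 to 7, pipe 2 holds entries 13 down to 7, and pipe 3 holds entries 13 to 19, so entry 7 is shared in column N and entry 13 is shared in column 1.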
Our algorithm processes these pipes, from the last pipe to the first pipe, for each fixed pipe X, X ≥ 1, successively; it is similar to Algorithm_2, but the roles of the text and the pattern are exchanged. Our partitionable parallel algorithm for Case 2 is described below.
Step 1 Each processor PE(1, j), 1 ≤ j ≤ N, broadcasts all of its own data to the others in the same column via the vertical bus system. It takes O(2) time.
[...] ports when X is odd; connects its N (S) and W ports when X is even.
Step 5 Each processor holding P(1, 1), P(1, m), and '+' connects its W and S (N) ports for odd X and odd (even) Y when T(1, [...] Otherwise, it does nothing.
Step 7 Each processor PE(i, j) holding ',' connects its W and E ports; connects its W and S (N) ports for odd X and odd (even) Y; and connects its E and S (N) ports for even X and odd (even) Y.
Step 8 [...] (8, 2) as the matched ending location; PE(1, 4) reports the data (4, 2) as the matched starting location and (8, 2) as the matched ending location (see Figs. 11(k) and (l)).
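The reported locations are pairs (run index, offset within the run). A minimal sketch of how such a pair translates into a character position is given below. It assumes that the leading entries of the example text are those of the Introduction's text string, with run lengths 3, 5, 2, 3, 3, 4, 2, 2, 2; the function name `to_position` is hypothetical.

```python
from itertools import accumulate


def to_position(location, run_lengths):
    """Convert a 1-indexed (run, offset) pair into a 1-indexed character position."""
    run, offset = location
    # before[r] is the total length of the first r runs.
    before = [0] + list(accumulate(run_lengths))
    return before[run - 1] + offset


run_lengths = [3, 5, 2, 3, 3, 4, 2, 2, 2]  # assumed: Introduction's text
first_match = (to_position((2, 4), run_lengths), to_position((8, 2), run_lengths))
second_match = (to_position((4, 2), run_lengths), to_position((8, 2), run_lengths))
```

Under this assumption, the locations (2, 4), (8, 2) and (4, 2), (8, 2) correspond to the character ranges 7 to 24 and 12 to 24 quoted in the Introduction.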
