The Redundant Array of Independent Disks
A landmark paper, titled "A case for redundant arrays of inexpensive disks (RAID)", was presented by Patterson, Katz and Gibson in 1988 (Patterson et al., 1988). RAID systems can be configured in various ways to reach a compromise among data access speed, system reliability and storage capacity. The general trade-off is to increase data access speed by spreading the data over N disks, which also increases the amount of storage available by a factor of N. On the other hand, more disks lower the reliability of the disk system, which leads to the need for data redundancy and for additional algorithms to protect valuable data. There are several levels in the RAID system, such as RAID-0, RAID-1, RAID-10, RAID-2, RAID-3, RAID-4, RAID-5, RAID-6, and RAID-7. The most commonly used levels for this trade-off are RAID-0, RAID-10, RAID-5, and RAID-6.
The RAID-0
RAID-0, as shown in Fig. 1, stripes all data across multiple drives in a disk array. This is a high I/O performance solution, since it can simultaneously serve many small individual accesses as well as large accesses, with all disks transferring in parallel. Thus, very high data transfer rates (both reads and writes) may be achieved. RAID-0 is very suitable for I/O time-critical or real-time applications, such as video on demand (VoD) systems. However, RAID-0 maximizes access speed and available space at the expense of reliability. Because RAID-0 provides no data protection, the probability of array failure increases with the number of disk drives: any single failing drive breaks the entire disk array.
The RAID-1
RAID-1 writes data to two drives simultaneously in a replicated way, as shown in Fig. 2. If one drive fails, the data can still be retrieved from its counterpart in the RAID set. This process is also called "disk mirroring". Mirroring refers to the 100% duplication of data from one disk to another, which makes it the most expensive RAID level (double the storage hardware is required). Its main advantage is reliability: RAID-1 protects against a single disk failure, but doubles the cost of the available storage. RAID-1 is well suited to applications that need both reliability and performance, such as OS disks and accounting data.
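The remark that RAID-0 becomes less reliable as drives are added can be quantified with a short worked equation. It rests on a simplifying assumption that is not part of the text: the N drives fail independently, each with the same probability p over the period of interest.

$$P_{\text{RAID-0}} = 1 - (1 - p)^N, \qquad P_{\text{mirrored pair}} = p^2 .$$

For example, with p = 0.02 and N = 8, the striped array fails with probability $1 - 0.98^8 \approx 0.149$, whereas a single mirrored pair loses data only with probability $0.02^2 = 0.0004$.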
The RAID-10
RAID-10 is a combination of RAID-0 and RAID-1, where the data are striped and mirrored as shown in Fig. 3. This provides higher speed and reliability, but has the same cost problem as RAID-1. Again, it can only tolerate a single disk failure in each disk pair.
The RAID-5
RAID-5 uses a parity encoder to produce parity information and to provide data recovery at a low cost. The architecture of RAID-5 is shown in Fig. 4. Although only one disk is added for redundancy, data are striped over all disks so that large files can be fetched with high bandwidth. To balance the disk access load, a rotating parity is used. Many small random blocks can be accessed in parallel without hot spots or bottlenecks in any single disk. When a single disk fails, the data can still be reconstructed from the parity information.
The RAID-6
Fig. 5 shows the model of RAID-6. RAID-6 differs from RAID-5 in that it uses two additional disks so that the loss of any two disks can be recovered. This capability gives the disk array a higher fault tolerance. The popular techniques for RAID-6 are EVENODD and RS codes, which will be discussed later.
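The RAID-5 recovery described above can be illustrated with a minimal sketch. It assumes byte-wise XOR parity over one stripe of a 4-disk array; the helper names and the rotation rule in `parity_disk` are hypothetical choices for illustration, not taken from the text.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_disk(stripe, n_disks):
    """Hypothetical rotation rule: the parity block moves one disk per stripe."""
    return (n_disks - 1 - stripe) % n_disks

# One stripe on a 4-disk array: three data blocks plus one parity block.
data = [b"\x11\x22\x33", b"\x0f\xf0\xaa", b"\x5a\xa5\x01"]
parity = xor_blocks(data)

# The disk holding data[1] fails: rebuild it from the survivors and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])                       # True
print([parity_disk(s, 4) for s in range(5)])    # [3, 2, 1, 0, 3]
```

Because XOR is its own inverse, the same routine serves for encoding and for reconstruction; rotating the parity position spreads parity writes over all disks.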
The Design of Reed-Solomon Codes for Error and Erasure Correction
The RS codes have been widely used in error control coding, especially in communication, satellite, and storage applications (Wicker, 1994). The RS codes are maximum distance separable (MDS) and hence achieve the largest possible minimum distance for codes of their size and dimension. In addition, RS codes are good at correcting burst errors. In the following, we discuss the basic definition of the encoder and illustrate the algorithms that correct one error or two erasures using the PGZ algorithm (Wicker, 1995).
The Reed-Solomon Codes Encoder
The (n, k) RS codes over GF($2^m$) have the capability to correct $2t = n - k$ erasures or $t$ errors, where $n$ is the total length of a codeword, $k$ is the number of information symbols in GF($2^m$), and $t$ is the number of errors that the RS codes can correct. Let $\alpha$ be a primitive element of GF($2^m$), and let $\alpha, \alpha^2, \ldots, \alpha^{2t}$ be the zeros of the generator polynomial $g(x)$ for the $t$-error-correcting RS code. The generator polynomial is the product of the associated minimal polynomials:

$$g(x) = \prod_{j=1}^{2t} (x - \alpha^j)$$
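We define each codeword as a polynomial

$$C(x) = c_{n-1}x^{n-1} + \cdots + c_1 x + c_0,$$

where the coefficients $c_j$ are symbols in GF($2^m$). Since $g(x)$ divides every codeword, the codeword polynomial can be expressed in terms of $g(x)$ as

$$C(x) = q(x)\,g(x), \qquad \text{so that } C(\alpha^j) = 0 \text{ for } j = 1, 2, \ldots, 2t,$$

where $q(x)$ is a polynomial of degree less than $k$.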
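As an illustration of the encoder, the following sketch builds a systematic codeword with two parity symbols ($n - k = 2$, zeros $\alpha$ and $\alpha^2$). Several choices here are assumptions not stated in the text: the field is GF($2^8$) with primitive polynomial 0x11D, the parity symbols occupy the two lowest-order positions, and all function names are hypothetical.

```python
# GF(2^8) log/antilog tables. ASSUMPTION: primitive polynomial 0x11D, alpha = 2.
EXP, LOG = [0] * 510, [0] * 256
v = 1
for p in range(255):
    EXP[p], LOG[v] = v, p
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
for p in range(255, 510):
    EXP[p] = EXP[p - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def poly_eval(c, x):
    """Evaluate c[0] + c[1]*x + ... by Horner's rule."""
    acc = 0
    for coef in reversed(c):
        acc = gf_mul(acc, x) ^ coef
    return acc

def rs_encode(data):
    """Systematic encoding with g(x) = (x - alpha)(x - alpha^2).

    data[j] becomes the coefficient of x^(j+2); the returned list c has the
    parity symbols in c[0] and c[1] (remainder of x^2 * I(x) divided by g(x)).
    """
    g1 = EXP[1] ^ EXP[2]            # coefficient of x in g(x)
    g0 = EXP[3]                     # constant term alpha * alpha^2
    r1 = r0 = 0                     # running remainder r1*x + r0
    for d in reversed(data):        # highest-degree symbol first
        fb = d ^ r1
        r1 = r0 ^ gf_mul(fb, g1)
        r0 = gf_mul(fb, g0)
    return [r0, r1] + list(data)

codeword = rs_encode([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC])   # (8, 6) code
print(poly_eval(codeword, EXP[1]), poly_eval(codeword, EXP[2]))   # 0 0
```

Both evaluations are zero because the codeword is a multiple of $g(x)$, which is the property the decoder relies on.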
The Reed-Solomon Codes Decoder
Since the discovery of RS codes, efficient decoding algorithms have been in high demand for high-speed data processing. Peterson provided the first decoding algorithm for binary BCH codes (Peterson, 1960). Peterson's algorithm was then improved and extended to nonbinary codes by Gorenstein and Zierler (Gorenstein & Zierler, 1961), Chien (Chien, 1964), and Forney (Forney, 1965). In the following, we discuss how to correct two erasures or one error using the PGZ algorithm, and propose a fast error and erasure correction algorithm based on it. From this example, designs that tolerate more errors or erasures may be developed when needed.
The Syndrome Evaluation Algorithm
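The syndrome evaluation is the first procedure in error detection. Assume that a received codeword polynomial is

$$R(x) = C(x) + e(x),$$

where $e(x)$ is the error pattern, and the syndromes are

$$S_j = R(\alpha^j) = C(\alpha^j) + e(\alpha^j) = e(\alpha^j), \qquad j = 1, 2, \ldots, 2t,$$

because every codeword satisfies $C(\alpha^j) = 0$; consequently, the syndrome is actually an evaluation of the error pattern $e(x)$. If a correctable error $e(x)$ is present, at least one syndrome will be nonzero.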
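The syndrome evaluation $S_j = R(\alpha^j)$ can be sketched as follows. The field GF($2^8$) with primitive polynomial 0x11D, the codeword length of 8, and the function names are assumptions made for this example only.

```python
# GF(2^8) log/antilog tables. ASSUMPTION: primitive polynomial 0x11D, alpha = 2.
EXP, LOG = [0] * 510, [0] * 256
v = 1
for p in range(255):
    EXP[p], LOG[v] = v, p
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
for p in range(255, 510):
    EXP[p] = EXP[p - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def poly_eval(c, x):
    """Evaluate c[0] + c[1]*x + ... by Horner's rule."""
    acc = 0
    for coef in reversed(c):
        acc = gf_mul(acc, x) ^ coef
    return acc

def syndromes(r, t=1):
    """Return [S_1, ..., S_2t] with S_j = R(alpha^j); r[p] multiplies x^p."""
    return [poly_eval(r, EXP[j]) for j in range(1, 2 * t + 1)]

# A valid codeword for t = 1: the coefficients of g(x) = (x - alpha)(x - alpha^2),
# zero padded to length 8.
codeword = [EXP[3], EXP[1] ^ EXP[2], 1, 0, 0, 0, 0, 0]
received = list(codeword)
received[5] ^= 0x5A                 # inject one error of magnitude 0x5A at position 5

print(syndromes(codeword))          # [0, 0]
print(syndromes(received) == [gf_mul(0x5A, EXP[5]), gf_mul(0x5A, EXP[10])])   # True
```

The second line of output shows that the syndromes depend only on the error pattern: they equal $e_5\alpha^5$ and $e_5\alpha^{10}$ regardless of the codeword that was stored.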
The Design of Single Random Error Correction
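In high-speed storage applications, single random error or double erasure correction is a common requirement. The PGZ algorithm is applied to find the error locator polynomial $\Lambda(x)$. We represent the syndromes with $t$ errors in the received word as follows:

$$S_j = \sum_{i=1}^{t} Y_i X_i^{\,j}, \qquad j = 1, 2, \ldots, 2t,$$

where $X_i$ is the error locator for the ith error and $Y_i$ is the magnitude of the ith error. If a random error $e_i$ has been introduced into the received word at position $i$, so that $R(x) = C(x) + e_i x^i$, the syndromes are

$$S_1 = R(\alpha) = e_i\,\alpha^{i}, \qquad (6)$$

$$S_2 = R(\alpha^2) = e_i\,\alpha^{2i}. \qquad (7)$$

From equations (6) and (7), $e_i$ directly affects the syndromes $S_1$ and $S_2$. With a single error, the error magnitude $e_i$ is substituted by $Y_i$ and the error location $\alpha^i$ is substituted by $X_i$. Rearranging equations (6) and (7), we have

$$S_1 = Y_i X_i, \qquad (8)$$

$$S_2 = Y_i X_i^2. \qquad (9)$$

Finally, the direct solutions for the error location $X_i$ and magnitude $Y_i$ are obtained from equations (8) and (9):

$$X_i = \frac{S_2}{S_1}, \qquad (10)$$

$$Y_i = \frac{S_1^2}{S_2}. \qquad (11)$$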
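For a single error, the relations $S_1 = Y_i X_i$ and $S_2 = Y_i X_i^2$ give $X_i = S_2 / S_1$ and $Y_i = S_1^2 / S_2$ directly. The sketch below applies them; GF($2^8$) with primitive polynomial 0x11D, the example codeword, and the function names are assumptions for illustration.

```python
# GF(2^8) log/antilog tables. ASSUMPTION: primitive polynomial 0x11D, alpha = 2.
EXP, LOG = [0] * 510, [0] * 256
v = 1
for p in range(255):
    EXP[p], LOG[v] = v, p
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
for p in range(255, 510):
    EXP[p] = EXP[p - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_div(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]

def poly_eval(c, x):
    acc = 0
    for coef in reversed(c):
        acc = gf_mul(acc, x) ^ coef
    return acc

def correct_single_error(r):
    """Correct at most one symbol error using S1 = R(alpha), S2 = R(alpha^2)."""
    s1, s2 = poly_eval(r, EXP[1]), poly_eval(r, EXP[2])
    if s1 == 0 and s2 == 0:
        return list(r)                      # no error detected
    if s1 == 0 or s2 == 0:
        raise ValueError("not a single-error pattern")
    x = gf_div(s2, s1)                      # X_i = S2 / S1 = alpha^i
    pos = LOG[x]                            # error position i
    if pos >= len(r):
        raise ValueError("error locator outside the codeword")
    y = gf_div(gf_mul(s1, s1), s2)          # Y_i = S1^2 / S2
    fixed = list(r)
    fixed[pos] ^= y
    return fixed

# Codeword: coefficients of g(x) = (x - alpha)(x - alpha^2), zero padded to length 8.
codeword = [EXP[3], EXP[1] ^ EXP[2], 1, 0, 0, 0, 0, 0]
received = list(codeword)
received[5] ^= 0x5A
print(correct_single_error(received) == codeword)   # True
```

No search over positions is needed: one division locates the error and one more gives its magnitude, which is what makes the single-error case attractive for a combinational hardware design.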
The Design of Single or Double Erasure Correction
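By definition, an erasure is an error whose location is known. Starting from the case of double erasures, we assume the magnitudes $Y_i$ and $Y_j$ at the ith and jth locations in the codeword, respectively, with locators $X_i = \alpha^i$ and $X_j = \alpha^j$, and the affected syndromes are

$$S_1 = Y_i X_i + Y_j X_j, \qquad (12)$$

$$S_2 = Y_i X_i^2 + Y_j X_j^2. \qquad (13)$$

By solving equations (12) and (13), the magnitudes can be obtained from the syndromes as

$$Y_j = \frac{S_1 X_i + S_2}{X_j\,(X_i + X_j)}.$$

We then obtain $Y_i$ as

$$Y_i = \frac{S_1 + Y_j X_j}{X_i}.$$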
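The erasure case can be sketched in the same way. With known locators $X_i = \alpha^i$ and $X_j = \alpha^j$, solving the two syndrome equations gives $Y_j = (S_1 X_i + S_2) / (X_j (X_i + X_j))$ and $Y_i = (S_1 + Y_j X_j) / X_i$; for a single erasure, $Y_i = S_1 / X_i$. GF($2^8$) with primitive polynomial 0x11D and the function names are assumptions of this example.

```python
# GF(2^8) log/antilog tables. ASSUMPTION: primitive polynomial 0x11D, alpha = 2.
EXP, LOG = [0] * 510, [0] * 256
v = 1
for p in range(255):
    EXP[p], LOG[v] = v, p
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
for p in range(255, 510):
    EXP[p] = EXP[p - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_div(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]

def poly_eval(c, x):
    acc = 0
    for coef in reversed(c):
        acc = gf_mul(acc, x) ^ coef
    return acc

def correct_erasures(r, i, j=None):
    """Repair one or two symbols of r whose positions i (and j) are known."""
    s1, s2 = poly_eval(r, EXP[1]), poly_eval(r, EXP[2])
    xi = EXP[i]
    fixed = list(r)
    if j is None:                                   # single erasure
        fixed[i] ^= gf_div(s1, xi)                  # Y_i = S1 / X_i
        return fixed
    xj = EXP[j]
    yj = gf_div(gf_mul(s1, xi) ^ s2, gf_mul(xj, xi ^ xj))
    yi = gf_div(s1 ^ gf_mul(yj, xj), xi)
    fixed[i] ^= yi
    fixed[j] ^= yj
    return fixed

# Codeword: coefficients of g(x) = (x - alpha)(x - alpha^2), zero padded to length 8.
codeword = [EXP[3], EXP[1] ^ EXP[2], 1, 0, 0, 0, 0, 0]
received = list(codeword)
received[1], received[6] = 0xFF, 0x42       # two failed disks return garbage
print(correct_erasures(received, 1, 6) == codeword)   # True
```

The erased positions may hold arbitrary values; the computed magnitudes are the differences between those values and the original symbols, so adding them restores the codeword.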
The Design of Reed-Solomon Codes with Small Write Capability
In the design of highly reliable systems for banks, stock markets and hospitals, RAID-6 systems are normally applied. A problem that affects such systems is the frequent transaction or updating of data. Each of these frequent writes involves only a small number of bytes compared with a record block of several kilobytes. This is called the small write problem, and it is an important factor in the performance of a RAID system. The small write problem is caused by frequent modifications of partial codewords, because the parity bytes of each codeword also need to be updated. In other words, to update the parities, a block of data has to be read even though only part of the data is updated. As shown in Fig. 6, in RAID-5, assume that D0 is modified, so that the parity should be updated, too. A naive method fetches all the unmodified data of the stripe and encodes it again, causing a great delay and a serious penalty when writing a small amount of data. A smarter algorithm is as follows: first, read the old D0 and the old parity P, and then perform an exclusive-OR operation with the new data D0' to obtain the new parity P'. Second, write the new data D0' and the new parity P' back. This algorithm needs 2 read and 2 write operations to update D0. On the other hand, RAID-6 systems have 2 parities, and modifying the parities might require fetching a block of data to calculate the new parities. This is another inefficient method that fetches more data and performs the encoding again. The proposed algorithm limits the number of accesses so that only the changed data are read before the new parities $c_0'$ and $c_1'$ are calculated, which may be different from the original $c_0$ and $c_1$ in the encoder.
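The RAID-5 read-modify-write update described above (new parity P' = P xor D0 xor D0') can be sketched as follows. The block contents and function names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write_parity(old_data, old_parity, new_data):
    """New parity from the old data block, old parity block and new data block."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

# Stripe with three data blocks; only D0 is rewritten.
d0, d1, d2 = b"\x10\x20", b"\x0a\x0b", b"\xf0\x0f"
parity = xor_bytes(xor_bytes(d0, d1), d2)

new_d0 = b"\x99\x77"
new_parity = small_write_parity(d0, parity, new_d0)    # 2 reads (D0, P), 2 writes (D0', P')

# Same result as re-encoding the whole stripe, without touching D1 or D2.
print(new_parity == xor_bytes(xor_bytes(new_d0, d1), d2))   # True
```

The cost stays at two reads and two writes no matter how many data disks the stripe spans, which is the property the RS-based small write design aims to keep for two parities.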
With the advance of VLSI, the proposed IP can be implemented entirely as a combinational circuit, which provides high-speed, low-delay calculation and a simple interface to current RAID systems, as presented in the following sections.
The Algorithm of Small Write Encoder
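Regarding RS-RAID, according to the design of RS codes in Section 3.1, if a data symbol in a codeword $C(x)$ is modified, the original parity symbols $c_0$ and $c_1$ have to be re-encoded. Assume that the new symbol $c_i'$, $2 \le i \le n-1$, is an updated symbol in the codeword $C(x)$. Taking $c_i'$ as the coefficient of the new input, the encoder should generate the new parity symbols $c_0'$ and $c_1'$ so that the updated codeword is still valid. Every codeword must satisfy

$$C(\alpha) = 0, \qquad (17)$$

$$C(\alpha^2) = 0. \qquad (18)$$

We also know that $c_0$ and $c_1$ are parity check symbols, thus equations (17) and (18) can be written for the original codeword as

$$c_0 + c_1\alpha + c_i\alpha^{i} + \sum_{j \ge 2,\, j \ne i} c_j\alpha^{j} = 0, \qquad (19)$$

$$c_0 + c_1\alpha^2 + c_i\alpha^{2i} + \sum_{j \ge 2,\, j \ne i} c_j\alpha^{2j} = 0, \qquad (20)$$

and for the updated codeword as

$$c_0' + c_1'\alpha + c_i'\alpha^{i} + \sum_{j \ge 2,\, j \ne i} c_j\alpha^{j} = 0, \qquad (21)$$

$$c_0' + c_1'\alpha^2 + c_i'\alpha^{2i} + \sum_{j \ge 2,\, j \ne i} c_j\alpha^{2j} = 0. \qquad (22)$$

To solve for the parity changes $\Delta_0 = c_0 + c_0'$ and $\Delta_1 = c_1 + c_1'$, subtract equation (19) from equation (21) and do the same with equations (20) and (22). Since addition and subtraction coincide in GF($2^m$), with $\Delta_i = c_i + c_i'$ we obtain

$$\Delta_0 + \Delta_1\alpha = \Delta_i\alpha^{i}, \qquad (23)$$

$$\Delta_0 + \Delta_1\alpha^2 = \Delta_i\alpha^{2i}. \qquad (24)$$

From equations (23) and (24), it is found that we do not need the whole codeword to generate the new set of parity symbols. This is the key to calculating the new parity symbols on line. Equations (23) and (24) can be solved simultaneously for $\Delta_0$ and $\Delta_1$, and we have

$$\Delta_1 = \Delta_i\,\frac{\alpha^{i} + \alpha^{2i}}{\alpha + \alpha^2}, \qquad \Delta_0 = \Delta_i\alpha^{i} + \Delta_1\alpha. \qquad (25)$$

Finally, the new parity check coefficients can be expressed in terms of the old parities and the parity changes:

$$c_0' = c_0 + \Delta_0, \qquad (26)$$

$$c_1' = c_1 + \Delta_1. \qquad (27)$$

This shows that the encoder can update the parities without a sequential stream of the whole codeword. If all the elements are over GF($2^8$), equations (26) and (27) involve only the old parities, the change $\Delta_i$ of the modified symbol, and constants that depend on the position $i$:

$$c_0' = c_0 + \Delta_i\,\frac{\alpha^{i+1}\,(1 + \alpha^{i-1})}{1 + \alpha}, \qquad (28)$$

$$c_1' = c_1 + \Delta_i\,\frac{\alpha^{i-1}\,(1 + \alpha^{i})}{1 + \alpha}. \qquad (29)$$

By observing equations (28) and (29), a combinational circuit can be constructed as a VLSI module to realize this function.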
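The small write update can be checked numerically. Writing $\Delta_i$ for the XOR of the old and new data symbol at position $i$, the requirement that the updated codeword still vanishes at $\alpha$ and $\alpha^2$ gives $\Delta_1 = \Delta_i(\alpha^i + \alpha^{2i})/(\alpha + \alpha^2)$ and $\Delta_0 = \Delta_i\alpha^i + \Delta_1\alpha$ for the parity changes. The sketch assumes GF($2^8$) with primitive polynomial 0x11D and parity symbols at positions 0 and 1; the function names are hypothetical.

```python
# GF(2^8) log/antilog tables. ASSUMPTION: primitive polynomial 0x11D, alpha = 2.
EXP, LOG = [0] * 510, [0] * 256
v = 1
for p in range(255):
    EXP[p], LOG[v] = v, p
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
for p in range(255, 510):
    EXP[p] = EXP[p - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_div(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]

def poly_eval(c, x):
    acc = 0
    for coef in reversed(c):
        acc = gf_mul(acc, x) ^ coef
    return acc

def small_write_update(c0, c1, i, old_sym, new_sym):
    """Return the new parities (c0', c1') after the symbol at position i changes."""
    delta_i = old_sym ^ new_sym
    a_i, a_2i = EXP[i], EXP[2 * i]                       # alpha^i and alpha^(2i), i <= 254
    delta1 = gf_div(gf_mul(delta_i, a_i ^ a_2i), EXP[1] ^ EXP[2])
    delta0 = gf_mul(delta_i, a_i) ^ gf_mul(delta1, EXP[1])
    return c0 ^ delta0, c1 ^ delta1

# Start from a valid codeword (coefficients of g(x), zero padded) and rewrite position 5.
codeword = [EXP[3], EXP[1] ^ EXP[2], 1, 0, 0, 0, 0, 0]
updated = list(codeword)
updated[5] = 0x37
updated[0], updated[1] = small_write_update(codeword[0], codeword[1], 5, codeword[5], 0x37)

print(poly_eval(updated, EXP[1]), poly_eval(updated, EXP[2]))   # 0 0
```

Only the old symbol, the new symbol and the two old parities are read, so the update costs three reads and three writes regardless of the stripe width.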
The RS-RAID System Design
Based on the explanations in Sections 3 and 4, the basic modules of RS codes are combined to develop a reliable disk system, or RS-RAID system. The RS-RAID system design not only tolerates the failure of two or more disks but also corrects errors and erasures transparently. Transparency here means high-speed, real-time processing without complicated software control. This section discusses the design of this RS-RAID system, from its operation to its system architecture.
System Design
In modern mass storage systems, there are usually ten or more disks in the RAID system, which lowers the reliability and raises the risk of data loss. A reliable storage system must satisfy the following requirements:
1. High-performance disk failure recovery: It not only features high-speed access but also tolerates the failure of two or more disks.
2. Low recovery time: When one or more disks do not return data within a limited period of time, the system control assumes that they are slow disks and recovers their original information from the existing data/codeword. This strategy must be realized with low access or recovery time so that it speeds up the system performance.
3. High confidence in individual disks: Each disk can identify whether its data are reliable or not.
Aiming to meet the above requirements, we first use CRC-32 as part of the major checking data to judge the health of data disks. The probability that CRC-32 misses an error is comparatively low. Secondly, a Reed-Solomon Product Code (RS-PC) is proposed with the support of CRC-32 checking bits to construct a highly reliable RS-RAID. The RS-PC is a combination of two (n, k) RS codes, denoted by the inner code $C_{row}$ and the outer code $C_{col}$. The $C_{row}$ and $C_{col}$ codes are presented as the parity symbols of line blocks in the row and column directions, as shown in Fig. 7. The codes $C_{row}$ and $C_{col}$ are combined for double protection, which is a "check-on-check" of the RS-PC that enhances the error-correcting capability. Since the two sets of parity symbols are applied to the line blocks in rows and columns, the architecture of RS-RAID is partitioned into dual levels, namely the system level and the disk level, as shown in Fig. 8. In this design, all disks are considered as one large logical storage unit. The host can access the data in RS-RAID through the IDE or SCSI interface. At the system level, there are n disks, and all data are encoded and decoded by the L1 (level one) RS-code codec through the PC interface. This design guarantees the reliability of data read from the large logical disk. The system cache memory, which temporarily stores the data encoded by the L1 RS-code codec, is used to buffer the currently used data. The cache buffer improves the RS-RAID performance under frequent accesses to and from the system. At the disk level, each disk has space for n stripes. When the data are read from a disk, they are decoded by the L2 (level two) RS-code codec and then checked by the CRC-32 codec. If the number of errors exceeds the correcting capability of the L2 RS-code codec, this situation will be detected by the CRC-32 codec. This guarantees that the data read from an individual disk are in a reliable condition.
This system offers high capacity, throughput and reliability, because all encoding and decoding processes operate in real time.
System Operation
For ease of explanation, the operation of the system is described in terms of the error correction design. Operating at dual levels enhances the system reliability and increases the number of errors tolerated by the system.
The System Level
At the system level, the L1 RS-code codec can be partitioned into the encoder and the decoder parts, as shown in Fig. 9. When a bulk write from the host is performed, the information of stripe $u$ is expressed as a set of $k$ information symbols $I_u(x)$ and encoded into a codeword $C_u(x)$ of $n$ symbols for each stripe of data, where $n$ is the total number of disks in the system and $k$ is the number of data disks. When frequently rewritten data are sent to the system cache sequentially from the host, the parity symbols take new values $c_0'$ and $c_1'$, which must be updated in real time, as shown in Fig. 9.
Fig. 9. The RS codes codec block diagram of system level

Before being written back to disks, all data are temporarily stored in the system cache. During write back, the data are first encoded into the outer code $C_{col}$, where n is the number of stripes of a disk. When data are read from an unreliable disk or communication channel, the received stripe $R_u(x)$ may contain errors or erasures. Following the previous research in (Jing et al. 2001), the procedure is as follows:
A. The $R_u(x)$ stripe is first sent to the error corrector, which generates its syndromes $S_1$ and $S_2$ for error checking and correction purposes.
B. When a random error occurs, the corrector uses the syndromes to calculate its magnitude $Y_i$ at position $X_i$.
C. With erasure(s), the corrector first sets one or both of $X_i$ and $X_j$ to the already known position(s) of the erasure(s) and then solves for the corresponding magnitudes $Y_i$ and $Y_j$.
When an error or erasure(s) is found, the corrector corrects the $R_u(x)$ stripe using $S_1$ and $S_2$ immediately after the read completes. The timing of this procedure is shown in Fig. 10. With such high-speed correction, we consider this operation a real-time correction process.
The Disk Level
Fig. 11. The RS codes codec block diagram of system level

At the disk level, as shown in Fig. 11, when bulk data are written from the cache into the disks, the encoding procedure is as follows:
1. The encoded stripe in cache is written into disks.
2. Each disk receives its own data I, which is then encoded by the CRC-32 codec.
3. The encoded data stream from the CRC-32 codec is encoded again by the L2 RS-codes codec and stored in the disks. Thus, the RS-PC encoding is finished.
When the system reads data from the disks into the cache, all data are decoded by the L2 RS-codes codec and checked by the CRC-32 codec. The decoding procedure at the disk level is as follows:
1. The data in each disk are decoded and corrected by its own L2 RS-codes codec.
2. The decoded data stream from each L2 RS-codes codec is then checked by its own CRC-32 codec.
Finally, if the number of errors is greater than the error-correcting capacity of the L2 RS-codes codec, the CRC-32 codec detects the errors and reports them to the system level, so that this disk is marked as an erasure at the system level. This strategy enhances the system reliability and increases the data access speed, because failed disks seldom need to be retried. This design provides double protection for the RAID system in real time.
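The hand-off from the disk level to the system level can be sketched with Python's standard `zlib.crc32`. This is a simplified illustration under stated assumptions: the L2 RS stage is omitted, the 4-byte checksum is appended in big-endian order, and the function names are hypothetical rather than part of the proposed design.

```python
import zlib

def crc_protect(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload before it is stored on a disk."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def crc_check(block: bytes):
    """Return (payload, ok), where ok tells whether the stored CRC-32 matches."""
    payload, stored = block[:-4], int.from_bytes(block[-4:], "big")
    return payload, zlib.crc32(payload) == stored

def read_stripe(blocks):
    """Check one block per disk; failed disks are reported as erasure positions."""
    payloads, erasures = [], []
    for disk, block in enumerate(blocks):
        payload, ok = crc_check(block)
        payloads.append(payload if ok else None)
        if not ok:
            erasures.append(disk)
    return payloads, erasures

# Four disks, one block each; disk 2 returns a block with a flipped bit.
blocks = [crc_protect(bytes([d] * 8)) for d in range(4)]
damaged = bytearray(blocks[2])
damaged[3] ^= 0x01
blocks[2] = bytes(damaged)

payloads, erasures = read_stripe(blocks)
print(erasures)   # [2]
```

The erasure list is what the system level codec needs: with the failed positions known, it can use erasure correction, which tolerates twice as many failed symbols as error correction at unknown positions.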
Conclusion
This paper provides an example of using an RS code in a redundant array of independent disks system to correct a single random error or double erasures. It also points to new directions, such as the small write module and higher correction capabilities, for the design of an RS-RAID system. As a result, the proposed RS-RAID system has the following advantages:
1. Expandable design: Like RAID-6, this paper not only proposes a solution for dual disk failure, but also adopts the PGZ algorithm, which can be extended to correct fewer than six or seven errors.
2. Integrated concept: This system presents a unified RS-PC concept that partitions the system into two levels of abstraction. The modules at the disk level mainly deal with burst or random errors in disks, and the control at the system level corrects multiple failures of the system. In addition, each "disk" may be a single disk surface, so that a fault-tolerant hard disk can be produced.
3. Real-time updating capability: Applications for banks, stock markets, hospitals or military purposes require frequent transactions or updating of data. The small write module may support the system cache under a real-time requirement and handle the frequent update operations in the RAID system with very low overhead.
4. Suitability for co-design: The proposed algorithm is suitable for both hardware and software designs of the modules in RS code applications that use finite fields. Finite field arithmetic belongs to modern algebra, which has been widely applied to error correction codes and cryptography. This suggests that such hardware modules may be integrated into the math processors of future CPUs. The reliable control may be easily integrated into microcontrollers and general processors.
5. More applications: With advantages ranging from expandability to co-design of the system, this concept may extend to most memory systems, such as flash memory, DRAM, and so on.
