Submitted by Dave Burford
RAID, short for Redundant Array of Independent Disks, is a technology for managing how data is stored across multiple physical hard disks. A RAID combines the physical disks into a single virtual disk, so the operating system (and you) see one hard drive no matter how many physical drives are in the array. RAID is most commonly found on server computers.
There are three key RAID concepts:
- Mirroring: Writing the same data to more than one disk. Redundancy is achieved by maintaining two copies of the same data on different disks. If one disk fails, the system can continue to operate by using the unaffected disk.
- Striping: Splitting data across more than one disk. Data is broken into equal-sized blocks called stripes. Each stripe is written to a different physical disk in the array, one row at a time; each row is called a stripe set. Because the stripes in a stripe set are written and read independently and concurrently, performance (speed) is greatly enhanced.
- Parity/Error Correction: Redundancy data is calculated from sets of actual data values and stored so that problems can be detected and/or fixed. Since computer data is stored as binary numbers (0s and 1s), we can use Boolean (logical) operators to transform it. One of these operators is Exclusive OR, written XOR. The useful property of XOR is that applying it twice with the same value "undoes itself": (A XOR B) XOR B gives back A. This allows any single missing value in a set to be calculated from the remaining values and the parity.
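The XOR property described above can be tried out in a few lines of Python. This is only a minimal sketch: the function name xor_parity and the sample byte values are made up for illustration, and all blocks are assumed to be the same length.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, two bytes each.
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]
parity = xor_parity(data)

# Pretend the second block is lost. XOR-ing the surviving blocks
# with the parity gives the missing block back.
recovered = xor_parity([data[0], data[2], parity])
assert recovered == data[1]
```

The same function both creates the parity and recovers the missing value, which is exactly the "undoes itself" behavior of XOR.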
A number of standard schemes, referred to as RAID levels, have evolved. Each RAID level uses one or more of these concepts.
The purpose of RAID is to improve performance (read/write speed) or provide data protection (redundancy). Not all RAID levels provide both speed and redundancy. The actual characteristics of a RAID depend on the RAID level.
RAID is implemented by a RAID Controller, which can be either software or hardware. The RAID Controller implements the RAID level and processes all disk reads and writes. Because a software controller does this work on the computer’s own processor, which can greatly slow system performance, a hardware RAID is preferred. Only RAID levels supported by the RAID Controller can be implemented.
There are 7 Standard and 7 Nested "officially" recognized RAID levels. Rather than cover all of the levels, let’s review the basic and the most popular RAID Levels.
________________________
RAID 0 Striped Disk Array
- Requires at least 2 physical disks
- Provides fast response but no redundancy
- Recommended Use: Non-critical data that needs high speed at a low cost of implementation, changes infrequently and is backed up regularly; also temporary or "scratch" disks on larger machines.
Here two physical disks are configured as a RAID 0. A file is saved in 6 blocks, A-F. Note that each block (stripe) is written to a different disk than the block before it and that there are 3 stripe sets. If one disk fails, all files (data) on the array are lost.
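The two-disk layout just described can be sketched in Python. The function name stripe and the simple round-robin placement are assumptions made for illustration; a real controller works on fixed-size blocks of bytes rather than letters.

```python
def stripe(blocks, num_disks):
    """Distribute blocks across disks in round-robin order (RAID 0 style)."""
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks

disks = stripe(list("ABCDEF"), 2)
# Disk 1 holds A, C, E and Disk 2 holds B, D, F.
stripe_sets = list(zip(*disks))
# Each stripe set is one row across the disks: (A, B), (C, D), (E, F).
```

Every disk holds only part of the file, which is why losing a single disk loses the whole file.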

________________________
RAID 1 Mirrored Disk Array
- Requires 2 hard drives
- Provides complete data protection but is more expensive, as twice the disk capacity must be purchased
- Recommended Use: Situations requiring high fault tolerance without heavy emphasis on large amounts of storage capacity or top performance. Frequently used to protect the disks that store the server operating system and/or server applications.
Here two physical disks are configured as a RAID 1 mirror. As before, the file is saved in 6 blocks, A-F. Each block is written twice, once to each physical disk. If one disk fails, no data is lost because the other disk holds the same data.
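As a rough illustration of the mirror described above, the following Python sketch writes every block to both disks and reads from whichever disk is still working. The names mirror_write and read_all are hypothetical, and disk failure is simulated by simply listing the failed disk positions.

```python
def mirror_write(blocks, num_disks=2):
    """Write a full copy of every block to each disk (RAID 1 style)."""
    return [list(blocks) for _ in range(num_disks)]

def read_all(disks, failed=()):
    """Read the file from the first disk that has not failed."""
    for index, disk in enumerate(disks):
        if index not in failed:
            return list(disk)
    raise RuntimeError("all mirrored disks have failed")

disks = mirror_write(list("ABCDEF"))
# With the first disk failed, the second still returns blocks A-F.
blocks = read_all(disks, failed={0})
```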

________________________
RAID 5 Block Striping with Distributed Parity
- Requires at least 3 drives.
- Combines the Striping and Parity concepts.
- Some disk capacity is lost storing the parity data: the capacity of the RAID is (Size of Smallest Drive) * (Number of Drives - 1)
- Provides good response and good redundancy.
- Recommended Use: RAID 5 is seen by many as the ideal combination of good performance, good fault tolerance, high capacity and storage efficiency. It is best suited for transaction processing and is often used for "general purpose" service, as well as for relational database applications, enterprise resource planning and other business systems.
Here three physical disks are configured as a RAID 5. As before, the file is saved in 6 blocks, A-F. As in RAID 0, consecutive blocks (stripes) are written to different disks and there are 3 stripe sets. However, in addition to its data blocks, each stripe set has a Parity Block written, shown as Pn and in a dark shade. The Parity Block is calculated from the data blocks in its stripe set, which allows missing data to be re-created if one of the physical disks fails. If a second disk fails, the array fails.
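The three-disk example above can be sketched in Python as follows. Treat it as an illustration under stated assumptions rather than how any particular controller works: the function names are hypothetical, the position of the parity block rotates in one arbitrary order (real implementations differ in rotation order), each block is a single byte, and the number of blocks is assumed to be a multiple of (Number of Drives - 1).

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def raid5_layout(blocks, num_disks):
    """Return stripe sets (rows); each row has one entry per disk."""
    per_row = num_disks - 1              # data blocks per stripe set
    rows = []
    for start in range(0, len(blocks), per_row):
        data = blocks[start:start + per_row]
        row_index = start // per_row
        # Assumed rotation: parity moves one disk to the left each row.
        parity_disk = (num_disks - 1 - row_index) % num_disks
        row = list(data)
        row.insert(parity_disk, xor_blocks(data))
        rows.append(row)
    return rows

def rebuild_disk(rows, failed):
    """Re-create every block of the failed disk from the surviving disks."""
    return [xor_blocks([b for i, b in enumerate(row) if i != failed])
            for row in rows]

def raid5_capacity(drive_sizes):
    """(Size of Smallest Drive) * (Number of Drives - 1)."""
    return min(drive_sizes) * (len(drive_sizes) - 1)

rows = raid5_layout([b"A", b"B", b"C", b"D", b"E", b"F"], 3)
# The middle disk fails; its contents are rebuilt from the other two.
rebuilt = rebuild_disk(rows, failed=1)
assert rebuilt == [row[1] for row in rows]
```

For example, raid5_capacity([500, 500, 750]) gives 1000: with drives of 500, 500 and 750 GB, only 500 GB of each drive is used and the equivalent of one drive holds parity.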

________________________
RAID 10 Striping over Mirror Sets
- Requires a minimum of 4 physical drives to implement
- Capacity: (Size of Smallest Drive) * (Number of Drives) / 2
- Combines the Mirroring and Striping concepts.
- Provides high response and excellent redundancy
- Recommended Use: Mission-critical transaction or relational databases where both high performance and excellent redundancy are required. Because no parity has to be calculated, writes are faster than on parity-based levels.
Here four physical disks are configured as a RAID 10. Two sets of mirrored disks (Disks 1,3 and Disks 2,4) are striped. As before, the file is saved in 6 blocks, A-F. As in a RAID 1 mirror, each block is written twice, once to each physical disk of a mirror set. As in a RAID 0 stripe, consecutive blocks (stripes) are written to different mirror sets and there are 3 stripe sets. Missing data can be re-created if one of the physical disks fails, but two disk failures in the same mirror set will cause the array to fail.
Note: This RAID is NOT the same as RAID 0+1 Mirrored Stripes!
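The four-disk RAID 10 example can be sketched in Python along the same lines. The function names and the use of plain disk numbers are assumptions for illustration; the mirror sets (Disks 1,3 and Disks 2,4) follow the example above.

```python
MIRROR_SETS = [(1, 3), (2, 4)]   # disk numbers from the example above

def raid10_layout(blocks, mirror_sets):
    """Stripe blocks across mirror sets; copy each block to both disks of a set."""
    disks = {disk: [] for pair in mirror_sets for disk in pair}
    for index, block in enumerate(blocks):
        for disk in mirror_sets[index % len(mirror_sets)]:
            disks[disk].append(block)
    return disks

def array_survives(failed_disks, mirror_sets):
    """The array survives while every mirror set has at least one working disk."""
    return all(any(disk not in failed_disks for disk in pair)
               for pair in mirror_sets)

def raid10_capacity(drive_sizes):
    """(Size of Smallest Drive) * (Number of Drives) / 2."""
    return min(drive_sizes) * len(drive_sizes) // 2

layout = raid10_layout(list("ABCDEF"), MIRROR_SETS)
# Disks 1 and 3 both hold A, C, E; Disks 2 and 4 both hold B, D, F.
```

With this layout, array_survives({1, 2}, MIRROR_SETS) is True because each mirror set still has one working disk, while array_survives({1, 3}, MIRROR_SETS) is False because both copies of blocks A, C and E are gone.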

January 2008
