RAID is an acronym for Redundant Array of Inexpensive Disks (as originally named by its inventors) or, alternatively, Redundant Array of Independent Disks, a name that later developed within the computing industry. RAID is a technology that uses two or more hard disk drives together to achieve better performance, greater reliability, larger data volumes, or some combination of these.
A RAID distributes data across several physical disks, which appear to the operating system and the user as a single disk. Several different arrangements are possible.
Some arrays are "redundant": they write extra data, derived from the original data, across other disks in the array, so that the failure of one (sometimes more) of the disks does not result in data loss. In this case, the failed drive or drives are replaced and their data is reconstructed from the redundant data on the other drives. Redundancy costs capacity, so a redundant array stores less data than its disks could otherwise hold. For example, a 2-disk RAID 1 array loses half of its capacity, and a RAID 5 array with several disks loses the capacity of one disk. Other RAID arrays are arranged so that they are faster to write to and read from than a single disk.
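One common way to derive such redundant data is XOR parity, the scheme behind parity-based levels such as RAID 5. The following is a minimal sketch, not a real RAID implementation: each "disk" is just a short byte string, and the names xor_blocks, data, parity, and recovered are chosen here for illustration. It shows that XOR-ing the surviving blocks with the parity block yields the block that was lost.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equally sized byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per (hypothetical) data disk.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The redundant block is derived from the original data.
parity = xor_blocks(data)

# Simulate losing the second disk: only the other blocks and parity survive.
survivors = [data[0], data[2], parity]

# XOR-ing the survivors reproduces the missing block.
recovered = xor_blocks(survivors)
print(recovered == data[1])
```

The same function serves for both computing parity and rebuilding a lost block, because XOR is its own inverse. A single parity block can only make up for one missing block per stripe.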
There are various approaches to RAID technology, each with a different trade-off among protection against data loss, capacity, and speed. There are many level designations, but the most common RAID levels include:
RAID 0 (striped disks) distributes data across several disks in a way that gives improved speed and full capacity, but all data on all disks will be lost if any one disk fails.
RAID 1 (mirrored disks) uses two, or possibly more, disks that each store the same data, so that data is not lost as long as one disk survives. Total capacity of the array is just the capacity of a single disk. When one drive fails because of a hardware or software malfunction, the array loses redundancy, and the data then depends on the reliability of the remaining drives (second, third, etc.).
RAID 5 (striped disks with parity) combines three or more disks in a way that protects data in the event of the loss of any one disk. The storage capacity of the array is reduced by the capacity of one disk. The less common RAID 6 can recover from the loss of two disks.
RAID 6 (striped set with dual distributed parity) provides fault tolerance that allows the array to continue to operate with up to two failed drives. This makes larger RAID groups more practical, especially for high availability systems. It becomes increasingly important as drive capacities grow, because large-capacity drives lengthen the time needed to recover from the failure of a single drive, and a single-parity array cannot survive a second failure during that time.
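The capacity and fault-tolerance trade-offs of the four levels above can be put into a few lines of arithmetic. The sketch below assumes all disks in the array are the same size. The function name array_summary, the gigabyte unit, and the minimum disk counts for RAID 0 (two) and RAID 6 (four) are assumptions added here; the RAID 6 capacity figure follows from its two sets of parity.

```python
def array_summary(level, disk_count, disk_size_gb):
    """Return (usable capacity in GB, number of disk failures survived).

    Assumes every disk in the array has the same size.
    """
    minimum_disks = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if disk_count < minimum_disks[level]:
        raise ValueError(
            f"RAID {level} needs at least {minimum_disks[level]} disks"
        )

    if level == 0:
        # Striping only: full capacity, no redundancy.
        return disk_count * disk_size_gb, 0
    if level == 1:
        # Every disk holds the same data: capacity of one disk.
        return disk_size_gb, disk_count - 1
    if level == 5:
        # One disk's worth of capacity goes to parity.
        return (disk_count - 1) * disk_size_gb, 1
    # RAID 6: two disks' worth of capacity goes to parity.
    return (disk_count - 2) * disk_size_gb, 2


for level in (0, 1, 5, 6):
    usable, tolerated = array_summary(level, disk_count=4, disk_size_gb=1000)
    print(f"RAID {level}: {usable} GB usable, survives {tolerated} failed disk(s)")
```

With four 1000 GB disks, this gives 4000 GB for RAID 0, 1000 GB for RAID 1, 3000 GB for RAID 5, and 2000 GB for RAID 6.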
RAID involves significant computation when reading and writing information. With true hardware RAID, the controller does the work. In other cases, the host computer's processor does the computing, which reduces the computer's performance on processor-intensive tasks. Simple RAID controllers may provide only levels 0 and 1, which require less processing.
RAID systems with redundancy continue working without interruption when one, or sometimes more, of the disks in the array fail, although they are then more vulnerable to further failures. When the bad disk is replaced by a new one, the array is rebuilt while the system continues to operate normally. Some systems have to be shut down when removing or adding a drive; others support "hot swapping", allowing drives to be replaced without powering down. A RAID with hot-swap drives is often used in high availability systems, where it is important that the system keep running at all times.
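The fail, keep running, replace, and rebuild cycle can be illustrated with a toy model of a mirrored (RAID 1) array. This is only a sketch under simplifying assumptions: each disk is modelled as an in-memory dictionary, a failed disk is represented by None, and the class name MirrorArray and its methods are hypothetical names chosen for this example rather than any real controller's interface.

```python
class MirrorArray:
    """Toy RAID 1 array: every working disk holds a full copy of the data."""

    def __init__(self, disk_count=2):
        if disk_count < 2:
            raise ValueError("a mirror needs at least two disks")
        # Each disk maps block number -> bytes; None marks a failed disk.
        self.disks = [{} for _ in range(disk_count)]

    def is_degraded(self):
        """True while at least one member disk is failed."""
        return any(disk is None for disk in self.disks)

    def write(self, block, payload):
        """Write the block to every disk that is still working."""
        healthy = [disk for disk in self.disks if disk is not None]
        if not healthy:
            raise OSError("all disks have failed")
        for disk in healthy:
            disk[block] = payload

    def read(self, block):
        """Serve the read from the first working disk."""
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise OSError("all disks have failed")

    def fail(self, index):
        """Simulate the failure of one disk."""
        self.disks[index] = None

    def replace(self, index):
        """Swap in a blank disk and rebuild it by copying from a survivor."""
        if self.disks[index] is not None:
            raise ValueError("disk has not failed")
        source = next((disk for disk in self.disks if disk is not None), None)
        if source is None:
            raise OSError("no surviving disk to rebuild from")
        self.disks[index] = dict(source)


array = MirrorArray(disk_count=2)
array.write(0, b"hello")
array.fail(0)                      # the array is now degraded
print(array.read(0))               # reads are still served by the survivor
array.write(1, b"world")           # writes continue while degraded
array.replace(0)                   # rebuild onto the replacement disk
print(array.is_degraded())
```

In this model the rebuild is a single copy, so it finishes instantly. On real hardware it takes time proportional to the drive's capacity, and the array stays vulnerable until it completes.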
It is important to understand that RAID is a storage technology, not an alternative to backing up data. Data can still be damaged or destroyed without any harm to, or malfunction of, the drives in the array. For example, part of the data may be overwritten by a system malfunction, or a file may be damaged or deleted through user error or malice and the loss not noticed for days or weeks. Also, the physical drive array is always at risk from hazards such as theft, flood, and fire.