Nested RAID Confusion
Is it RAID 0+1 or 1+0?
I see this mix-up quite often when it comes to nested RAID arrays. Manufacturers of RAID systems don’t help much, as they quite often mix things up too. It usually comes up when comparing the performance or fault tolerance of RAID 5 or 6 to RAID 10: the proponents of RAID 5 or 6 will typically, and often unknowingly, substitute RAID 0+1 for RAID 10. With nested RAID arrays, the first number is the level applied first when building the array; the second number is then applied on top of the resulting arrays. Assuming 6 disks for simplicity, a RAID 0+1 would start out as two RAID 0 arrays with 3 disks in each. The two RAID 0 arrays would then be mirrored in a single RAID 1 array.
A RAID 10 (1+0) array starts by creating three RAID 1 arrays (2 disks in each array) and then striping them into a single RAID 0 array.
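To make the build order concrete, here is a minimal Python sketch of how the disks end up grouped in each case. The function names and the assumption that disks are numbered from 0 and assigned in order are mine, purely for illustration; a real controller may lay things out differently.

```python
def raid01_layout(n_disks):
    """RAID 0+1: build two stripe sets (RAID 0) first, then mirror them (RAID 1).

    Returns the two stripe sets as lists of disk numbers.
    """
    half = n_disks // 2
    return [list(range(0, half)), list(range(half, n_disks))]


def raid10_layout(n_disks):
    """RAID 1+0: build mirrored pairs (RAID 1) first, then stripe them (RAID 0).

    Returns the mirrored pairs as lists of disk numbers.
    """
    return [[d, d + 1] for d in range(0, n_disks, 2)]


print(raid01_layout(6))  # [[0, 1, 2], [3, 4, 5]]
print(raid10_layout(6))  # [[0, 1], [2, 3], [4, 5]]
```

With 6 disks the first call gives the two 3-disk stripes that get mirrored, and the second gives the three 2-disk mirrors that get striped.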
So what’s the difference? Fault tolerance, among other things. Let’s assume a single disk (disk 0) fails in each of the arrays. Both arrays still operate. Now what is the fault tolerance of the partially failed array? For RAID 0+1, the failure of disk 0 causes Stripe A (disks 0-2) to fail, but since Stripe B (disks 3-5) is a mirror of Stripe A we are still ok. If disk 3, 4 or 5 then fails, Stripe B fails too, and since Stripe A is already down the entire array fails. So there is a 3 in 5 chance (60%) that a second disk failure would be fatal.
For RAID 10, disk 0 belongs to Mirror A (disks 0-1). A failure of disk 1 would cause Mirror A to fail and, since all the mirrors are striped, the entire array would fail. But if any other disk fails, the array continues to operate. So there is a 1 in 5 chance (20%) that a second drive failure would be fatal.
Now that example was for illustration purposes. What happens when the number of disks is increased to a more realistic level, let’s say 10 disks? The RAID 0+1 array would have Stripe A consisting of disks 0-4 and Stripe B consisting of disks 5-9, which are then mirrored. Once again we start with disk 0 failed. In this case Stripe A has failed and any drive failure in Stripe B will bring down the whole array. The probability would be 5 in 9, or about a 56% chance that a second disk failure would be fatal.
The RAID 10 array would consist of 5 RAID 1 arrays of 2 disks each (disks 0-1, disks 2-3, disks 4-5, disks 6-7, disks 8-9) that are then striped. Starting with disk 0 failed, the array could handle a disk failure in any mirror except Mirror A (disks 0-1). So any disk can fail except disk 1. The probability then would be 1 in 9, or about an 11% chance that a second drive failure would be fatal.
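The numbers above can be checked by brute force. The sketch below is only an illustration under two assumptions of mine: disks are numbered and grouped in order as described in the text, and the second failure is equally likely to hit any of the remaining disks. The function names are hypothetical.

```python
from fractions import Fraction


def raid01_survives(failed, n_disks):
    # RAID 0+1: two stripes, mirrored. The array survives as long as
    # at least one stripe has no failed disk in it.
    half = n_disks // 2
    stripe_a = set(range(half))
    stripe_b = set(range(half, n_disks))
    return not (failed & stripe_a) or not (failed & stripe_b)


def raid10_survives(failed, n_disks):
    # RAID 1+0: mirrored pairs, striped. The array survives as long as
    # no pair has lost both of its disks.
    return all(not {d, d + 1} <= failed for d in range(0, n_disks, 2))


def fatal_second_failure(survives, n_disks, first_failed=0):
    # Try every possible second failure and count the fatal ones.
    others = [d for d in range(n_disks) if d != first_failed]
    fatal = sum(1 for d in others if not survives({first_failed, d}, n_disks))
    return Fraction(fatal, len(others))


for n in (6, 10):
    p01 = fatal_second_failure(raid01_survives, n)
    p10 = fatal_second_failure(raid10_survives, n)
    print(f"{n} disks: RAID 0+1 {p01} ({float(p01):.0%}), RAID 10 {p10} ({float(p10):.0%})")
```

Under those assumptions, for n disks the RAID 0+1 risk works out to (n/2) in (n-1) while the RAID 10 risk is 1 in (n-1), so the gap widens as disks are added.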
But what good is comparing RAID 0+1 to RAID 10? None; I don’t think anyone would implement a RAID 0+1 in production. Normally the choices are RAID 5, RAID 6 or RAID 10, with RAID 50 and 60 gaining a little popularity. Choosing the right RAID level is an involved process that has to take into account much more than just fault tolerance. And you do have a bulletproof off-site backup process in place, right?