Continuing SQL and RAID, let’s pick up RAID 5 and RAID 1+0 (or just 10) This content is still for the first skill measured in the 70-450, per the official Microsoft page for the 450:
Designing a SQL Server Instance and a Database Solution (14 percent)
Design for CPU, memory, and storage capacity requirements.
- This objective may include but is not limited to: RAID, calculating table size, IO throughput, transaction per second, data compression, non-uniform memory access (NUMA), tempdb capacity
So let’s jump right back in!
In RAID 5, the data is striped across all the discs similar to RAID 0, but with RAID 5, you also get a parity bit. This parity bit can be used to reconstruct missing data, in cases of drive failure. That way you get a speed boost on your read and writes, but you also get security from the parity bit.
RAID 5 optimizes reads and writes. Since your writes are split across all those disks, you get a speed boost. Reads and writes are (n-1) times faster than reading or writing to a single disk. N is the number of drives. So, in the example to the right, reads and writes are 3 times faster than to a single disk.
You effectively lose one disk’s worth of space in RAID 5. The amount of space needed for parity data is 1/n. So in our above diagram, we lose 1/4 of the space to parity storage. But, by giving up this drive space, you are gaining the ability to stand up to a single drive failure. When (not if) a drive failure occurs, your controller can use the remaining data to compute the missing data. There is a cost to computing the missing data, so your reads and writes are going to slow down. You’re going to want to replace the bad drive as soon as possible.
Raid 5 requires at least 3 drives. Yes, you’ll have to have 3 drives to create your RAID 5, The more drives you add, the faster the reads and writes become, But, the more drives you add, the greater your chances of a drive failure occurring. That’s just how it is, the more complex the system, the more chance for failure.
RAID 10 is a RAID 0 of RAID 1s. Data is then striped across a collection of RAID1s. Data is Striped and mirrored, so you get a huge read increase, and you do get a pretty good write increase, but this set up does give better read performance than write performance.
You lose half your space to the mirrored copy. But as a result, you could lose one copy of every mirror set, and still be able to read data off your RAID. As long as you don’t lose both member disks in a RAID 1 set, you’re OK. You don’t want to leave a bad disk in your set for long, you wouldn’t want to risk losing the second disk in a mirror.
RAID 10 requires 4 disks to set up. And considering 1/2 the disk space is spent on mirrors it can prove expensive. But with the rise in availability of SANs, the difference in costs between leasing RAID5 space and RAID10 space will be pennies. If you have very high transaction volume, and you have a very low tolerance for drive failure, RAID 10 is the choice to beat!
RAID 5 vs RAID 10
So when would I chose RAID 5 over RAID 10? If the database I’m implementing will be significantly heavier on writes than on reads. In RAID 5 you can write to each disk at the same time, in RAID 10, you only write to half the discs at the same time, then the RAID controller writes the mirrored copy. Now, with some really nice SANs, that write penalty on RAID 10 is masked by having a ton (highly specialized measurement) of cache on the SAN. But generally I’d chose a RAID 5 for write intensive databases.
Or if my client is cost conscious, then I might chose RAID 5 over RAID 10. If I’m building out a server, and that server is going to use direct attached storage, RAID 10 would cost 2x the storage space I want, while RAID 5 would only cost me 1/n more.
I’m going to hit you with a handful of scenarios requiring you to choose one or more raid solutions. It’ll be up to you to choose the best one given the information. If you have any questions before then, let me know. I’m here to help!