
RAID50 vs. RAID10
Hello colleagues,
the topic of RAID is probably a never-ending one. Storage strategies now have to be adapted to advanced virtualization technology, and larger SAN systems are finding their way into small and medium-sized businesses, so many admins are rethinking their optimal RAID setup. Compared with the past, the arrays are now much larger, and there is the added problem of fast data transmission over the network. We will discuss the principle of dedicated storage networks and parallel data paths another time.
The choice of RAID setup is the first crucial decision. It should be prepared carefully, because later changes are difficult to make. While many storage systems have a function to provide certain RAID levels in
But how do you derive the right RAID setup from the requirements of your own systems, which are hard to predict?
Roughly speaking, it’s quite simple:
You examine the performance and capacity requirements and then make a decision with the help of the following table of the most important RAID levels, which is probably at least partly familiar to everyone.
RAID level | RAID 0 | RAID 1 | RAID 5 | RAID 10 | RAID 50 |
Principle | Striping across 2 disks | One disk mirrored | Distributed parity | Striping across n disks, mirrored | Striping across n RAID 5 arrays |
Minimum number of disks | 2 | 2 | 3 | 4 | 6 |
Fault tolerance | none | 1 failure | 1 failure | 1 failure per subarray | 1 failure per subarray |
Usable capacity | 100% | 50% | 67% – 94% | 50% | 67% – 94% |
Performance | fast writes | fast reads | medium | fast | fast with large amounts of data |
Rebuild performance | – | good | medium | very good | medium |
Advantage | 100% capacity / performance | Data security / write performance | Data security / high capacity | Scalability / performance | Scalability / capacity |
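The capacity row of the table can be reproduced with a little arithmetic. The following is a minimal sketch that assumes equal-sized disks, two-way mirroring, and one parity disk per RAID 5 subarray; the function name `usable_fraction` and the `subarrays` parameter are illustrative and not taken from any storage vendor's tooling. Under these assumptions the 67% to 94% range appears to correspond to RAID 5 arrays of 3 to 16 disks.

```python
def usable_fraction(level: str, disks: int, subarrays: int = 2) -> float:
    """Usable capacity as a fraction of raw capacity (equal-sized disks assumed)."""
    if level == "0":
        return 1.0                      # pure striping, no redundancy
    if level in ("1", "10"):
        return 0.5                      # every block is stored twice
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) / disks      # one disk's worth of parity
    if level == "50":
        if disks % subarrays != 0 or disks // subarrays < 3:
            raise ValueError("each RAID 5 subarray needs at least 3 disks")
        return (disks - subarrays) / disks  # one parity disk per subarray
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    for level, disks in [("5", 3), ("5", 16), ("50", 6), ("10", 4)]:
        print(f"RAID {level} with {disks} disks: {usable_fraction(level, disks):.1%}")
```

Splitting the same disks into more RAID 5 subarrays lowers the usable fraction, because each subarray gives up one disk to parity.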
Large RAID setups
This worked reasonably well in the days of bare-metal machines that only had to supply themselves with storage. However, if you have to operate, for example, a VMware cluster with three ESX servers and a total of 50 virtual machines, this information is too thin a basis for a decision.
To anticipate the outcome: we quickly concluded that we should take a closer look at RAID levels 10 and 50, as they are the only ones that offer high performance, reliability and scalability at the same time.
Not all performance is the same
Storage performance is not easy to measure or to characterize with simple figures. As a rule, a medium-sized amount of data is simply written to or read from the storage under test sequentially. This is usually easy to implement and yields figures such as “73 MB/s write”. If large contiguous amounts of data are being moved, this is already helpful information.
However, other questions remain unanswered. Which properties of a storage system allow a Windows domain controller to boot in no time at all, or help to speed up a relational database? The decisive factor here is a criterion that is harder to measure: transactions per second (TPS), that is, the number of read or write accesses per unit of time. The TPS of a system can only be determined with more sophisticated benchmarks, because this requires a test environment that can issue a very large number of small and very small I/O requests in parallel. This is, however, much closer to the “natural” I/O load of an average server system than the usual sequential read and write rates.
Suitable benchmarks can be found, for example, in the open source world. To determine the TPS values we use, among others, a free tool called FIO. We have also had good experiences with another program called Bonnie on smaller storage systems. For tests on particularly fast systems, e.g. when SSDs are used as caches, we had to switch to FIO. We will discuss how the benchmark programs work and how to interpret their results in more detail in a separate blog article.
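As a rough illustration of what such a test can look like, the sketch below builds a command line for a 4 KB random I/O run with fio and sums the IOPS from fio's JSON report. It is not the configuration used for the measurements in this article: the queue depth, job count, file size, runtime and the target path are all assumptions chosen for the example, and the helper names `build_fio_command` and `total_iops` are hypothetical.

```python
import json


def build_fio_command(target, rw="randread", rwmixread=None):
    """Build an fio argument list for a 4 KB random I/O test (values are assumptions)."""
    cmd = [
        "fio",
        "--name=tps-test",
        f"--filename={target}",   # test file or device; hypothetical path in the demo below
        f"--rw={rw}",             # randread, randwrite or randrw
        "--bs=4k",                # small blocks, as in the 4 KB tests
        "--direct=1",             # bypass the page cache of the test VM
        "--ioengine=libaio",      # asynchronous I/O on Linux
        "--iodepth=32",           # assumed number of outstanding requests per job
        "--numjobs=4",            # assumed number of parallel workers
        "--size=1g",
        "--runtime=60",
        "--time_based",
        "--group_reporting",
        "--output-format=json",
    ]
    if rwmixread is not None:
        cmd.append(f"--rwmixread={rwmixread}")  # read percentage for randrw
    return cmd


def total_iops(fio_json_text):
    """Sum read and write IOPS over all jobs in an fio JSON report."""
    report = json.loads(fio_json_text)
    return sum(job["read"]["iops"] + job["write"]["iops"] for job in report["jobs"])


if __name__ == "__main__":
    # Only prints the command; running it requires fio and a suitable test file.
    print(" ".join(build_fio_command("/mnt/test/fio.dat", rw="randrw", rwmixread=70)))
```

The command can be passed to `subprocess.run(..., capture_output=True, text=True)` and its standard output handed to `total_iops`. Pointing `--filename` at a raw device with a write workload destroys the data on it, so a dedicated test file or volume is the safer choice.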
What criteria are decisive for high TPS values?
- The number of hard drives used.
The more hard drives (also called spindles) work at the same time, the more work gets done. This is especially true for TPS, because a conventional hard drive needs some time to reposition its read head. With many small accesses at different places on the disk, there are many short pauses while the head moves to the next position.
- The RAID setup.
The RAID level determines how much work the controller has to do and how well the hard drives can work in parallel. For example, a write access in a RAID 50 array actively involves more disks than one in a RAID 10 array, which should give RAID 10 a performance advantage.
- Performance and rotation speed of the hard drives used.
The faster the platter of the hard drive rotates, the less time passes until the requested data is under the read head. This wait occurs in particular after each repositioning of the head.
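The effect of rotation speed can be estimated with a simple model: on average the platter has to turn half a revolution before the requested sector arrives under the head, and a random access costs roughly one seek plus that rotational wait. The sketch below uses this model; the seek times in the example are assumed typical values, not measurements from our systems, and the function names are illustrative.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def estimated_iops_per_disk(rpm, avg_seek_ms):
    """Rough random-access IOPS of one disk: 1 / (seek time + rotational latency)."""
    return 1000 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


if __name__ == "__main__":
    # Seek times are assumptions for illustration only.
    for rpm, seek_ms in [(15_000, 3.5), (7_200, 8.5)]:
        print(
            f"{rpm} rpm: rotational latency {avg_rotational_latency_ms(rpm):.2f} ms, "
            f"about {estimated_iops_per_disk(rpm, seek_ms):.0f} IOPS per disk"
        )
```

The model ignores command queuing, caches and transfer time, so it only indicates the order of magnitude and the direction of the effect.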
In practice
So much for the theory. Now we want to know whether RAID 10 or RAID 50 is the better choice for our VMware cluster and how we can improve performance.
Setup
- VMware cluster with three ESX servers (96 GB RAM each) and a total of 50 virtual machines
- Dell EqualLogic PS4000 with 12 hard drives at 15,000 rpm each
- Dell EqualLogic PS4000 with 16 hard drives at 7,200 rpm each
- A Linux VM with FIO benchmark software.
Results (detail)
System | EQL1 (10 × 15,000 rpm) | EQL2 (14 × 7,200 rpm) | EQL2 (14 × 7,200 rpm) |
RAID level | RAID 10 | RAID 10 | RAID 50 |
Read TPS, 4 KB | 2161 tps | 1406 tps | 1390 tps |
Read/write TPS, 4 KB | 1360 tps | 800 tps | 595 tps |
On EQL2, RAID 10 and RAID 50 are compared directly on the same hardware.
In addition, two RAID 10 setups are compared on very similar hardware, but with different hard disk configurations.
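To relate these TPS figures to the familiar MB/s numbers, the transaction rate can be multiplied by the block size. The sketch below does this for the values in the table, assuming that "4 KB" means 4096 bytes and that every transaction transfers exactly one block; the function name is illustrative.

```python
def throughput_mb_per_s(tps, block_size_kib=4):
    """Data rate in MB/s (10^6 bytes) for a given transaction rate and block size."""
    return tps * block_size_kib * 1024 / 1_000_000


if __name__ == "__main__":
    # TPS values taken from the results table above.
    for label, tps in [("EQL1 RAID 10 read", 2161),
                       ("EQL2 RAID 10 read", 1406),
                       ("EQL2 RAID 50 read/write", 595)]:
        print(f"{label}: {tps} tps is about {throughput_mb_per_s(tps):.2f} MB/s")
```

The resulting data rates are a small fraction of what the same disks deliver sequentially, which illustrates why a sequential MB/s figure says little about behaviour under many small random accesses.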
Result:
As expected, RAID 10, as the “most expensive” RAID level, is also the fastest in terms of TPS. However, the lead is not very large (between 5% and 25% faster than RAID 50), and there is another surprise.
The type of hard drive, especially its rotation speed, ultimately seems to be decisive for performance, even in large arrays. According to our results, it is reasonable to assume that doubling the rotation speed also roughly doubles the number of transactions (provided you normalize for the number of hard drives and neglect differences in the rest of the hardware).
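The normalization mentioned here can be made explicit by dividing the measured TPS by the number of disks and comparing the per-disk values of the two RAID 10 setups. This sketch uses the disk counts from the results table (10 and 14) and, as in the text, neglects all other hardware differences; the dictionary layout and function names are only for illustration.

```python
# Values from the results table: the two RAID 10 setups.
RESULTS = {
    "EQL1": {"disks": 10, "rpm": 15_000, "read": 2161, "mixed": 1360},
    "EQL2": {"disks": 14, "rpm": 7_200, "read": 1406, "mixed": 800},
}


def tps_per_disk(system, workload):
    """Measured TPS divided by the number of disks in the array."""
    entry = RESULTS[system]
    return entry[workload] / entry["disks"]


def per_disk_ratio(workload):
    """How many times more TPS per disk EQL1 delivers than EQL2."""
    return tps_per_disk("EQL1", workload) / tps_per_disk("EQL2", workload)


if __name__ == "__main__":
    rpm_ratio = RESULTS["EQL1"]["rpm"] / RESULTS["EQL2"]["rpm"]
    print(f"Rotation speed ratio:    {rpm_ratio:.2f}")
    print(f"Read TPS per disk ratio: {per_disk_ratio('read'):.2f}")
    print(f"Read/write TPS per disk: {per_disk_ratio('mixed'):.2f}")
```

With these inputs the rotation speed ratio is about 2.08, the per-disk read ratio about 2.15 and the per-disk read/write ratio 2.38.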
Criterion | RAID 10 | RAID 50 |
Advantages | Best TPS values, fast rebuild, little performance loss with degraded arrays, slightly better redundancy | Best use of storage capacity (cheaper), decent TPS values, small advantage when reading large amounts of data |
Disadvantages | Only 50% of the storage capacity can be used (more expensive) | Performance loss on degraded arrays and during rebuild, slightly less redundancy |
Purpose | Fast databases, storage for systems (virtual and physical) with maximum performance requirements | Large databases, storage for (virtual and physical) computer systems with normal requirements, backup, file servers |
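A common rule of thumb helps to explain why RAID 10 pulls ahead mainly on mixed read/write workloads: a small random write costs about two disk operations on RAID 10 (one per mirror side) but about four on RAID 5 and RAID 50 (read data, read parity, write data, write parity). The sketch below applies this rule. It is a simplification that ignores controller caches and full-stripe writes, and the back-end IOPS figure and the 70% read share in the example are assumptions, since the read/write mix of the measurements above is not stated.

```python
# Rule-of-thumb number of disk operations per small random write.
WRITE_PENALTY = {"RAID 10": 2, "RAID 50": 4}


def frontend_iops(backend_iops, read_fraction, level):
    """Host-visible IOPS for a given raw disk IOPS budget and read share."""
    penalty = WRITE_PENALTY[level]
    write_fraction = 1.0 - read_fraction
    return backend_iops / (read_fraction + write_fraction * penalty)


if __name__ == "__main__":
    backend = 1400       # assumed raw IOPS of all disks together
    read_share = 0.7     # assumed read share of the workload
    for level in WRITE_PENALTY:
        print(f"{level}: about {frontend_iops(backend, read_share, level):.0f} IOPS")
```

For a pure read workload the model gives both levels the same result, which matches the small difference in the read-only row of the measurements.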
Based on these measurement results, we decided to operate two different storage systems for our VMware cluster: one with a lot of storage capacity, based on standard-speed disks and RAID 50, and a slightly smaller one with RAID 10 and fast disks. This allows us to place every virtual machine, and every other system that needs storage space, on the storage that best suits its requirements.
Hannes Wilhelm
WorNet AG 2012