RAID50 vs. RAID10

8 October 2012

Hello colleagues,

RAID is probably a never-ending topic. Especially now that storage strategies have to be adapted to advanced virtualization technology, and larger SAN systems are finding their way into small and medium-sized businesses, many admins are rethinking what their optimal RAID array looks like. Compared with the past, arrays are now much larger, and the data also has to be moved quickly over the network. We will discuss dedicated storage networks and parallel data paths another time.
Choosing the RAID setup is the first critical decision and should be prepared carefully, because later changes are difficult. Many storage systems do offer a function to convert one RAID level into another without data loss or downtime, but in practice the required combination is often not supported, or the live migration would put a significant load on the entire system for days and could impair daily operations. The choice of RAID level should therefore rest on solid considerations.

But how do you derive the right RAID setup from the requirements of your own systems when those requirements are hard to predict?

Roughly speaking, it’s quite simple:
You examine your performance and capacity requirements and then decide with the help of the following table of the most important RAID levels, which is probably at least partly familiar to everyone.

	RAID 0	RAID 1	RAID 5	RAID 10	RAID 50
Principle	Stripes across 2 disks	One disk mirrored	Distributed parity	Stripes across n disks, mirrored	Stripes across n RAID 5 arrays
Minimum number of disks	2	2	3	4	6
Fault tolerance	none	1 failure	1 failure	1 failure per sub-array	1 failure per sub-array
Usable capacity	100%	50%	67% – 94%	50%	67% – 94%
Performance	fast writes	fast reads	medium	fast	fast with large amounts of data
Rebuild performance	–	good	medium	very good	medium
Advantage	100% capacity / performance	Data security / write performance	Data security / high capacity	Scalability / performance	Scalability / capacity
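
The capacity row can be reproduced with a few lines of Python. This is only a minimal sketch: the function name usable_fraction and the subarrays parameter are our own inventions, all disks are assumed to be the same size, and hot spares are ignored. The 94% upper bound matches a RAID 5 with 16 disks, which is our assumption about how the range in the table came about.

```python
def usable_fraction(level: str, disks: int, subarrays: int = 2) -> float:
    """Return the usable share of raw capacity for equal-sized disks."""
    if level == "RAID 0":
        return 1.0                      # no redundancy at all
    if level in ("RAID 1", "RAID 10"):
        return 0.5                      # every block is stored twice
    if level == "RAID 5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) / disks      # one disk's worth of parity
    if level == "RAID 50":
        if disks % subarrays or disks // subarrays < 3:
            raise ValueError("each RAID 5 sub-array needs at least 3 disks")
        return (disks - subarrays) / disks  # one parity disk per sub-array
    raise ValueError(f"unknown RAID level: {level}")


for n in (3, 16):
    print(f"RAID 5 with {n} disks: {usable_fraction('RAID 5', n):.0%}")
print(f"RAID 50 with 6 disks: {usable_fraction('RAID 50', 6):.0%}")
```

With more sub-arrays a RAID 50 gives up more disks to parity, so the usable share drops while the number of tolerated failures (one per sub-array) rises.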

Large RAID setups

This worked reasonably well in the days of bare-metal machines that only had to supply themselves with storage. However, if you have to run, say, a VMware cluster with three ESX servers and a total of 50 virtual machines, this table is too thin a basis for a decision.
To anticipate the result: we quickly concluded that RAID levels 10 and 50 deserve a closer look, as they are the only ones that offer high performance, reliability and scalability at the same time.

Not all performance is the same

Storage performance is not easy to measure or to characterize with simple figures. Typically, a medium-sized amount of data is simply written to or read from the storage under test sequentially. This is easy to implement and yields numbers such as 73 MB/s write. If you mainly move large contiguous amounts of data, that is already helpful information. Other questions remain unanswered, however. Which properties of a storage system let a Windows domain controller boot in no time, or help speed up a relational database? The decisive factor here is a criterion that is harder to measure: transactions per second (TPS), that is, the number of read or write accesses per unit of time. The TPS of a system can only be determined with more sophisticated benchmarks, because the test environment has to issue an extremely large number of small and very small I/O requests in parallel. This is, however, much closer to the “natural” I/O load of an average server system than the usual sequential read and write rates.
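
Why a sequential MB/s figure says so little about TPS becomes clear when you convert between the two: data rate is simply requests per second times request size. The following sketch illustrates this; the function names are ours, the 1000 requests per second are a made-up example value, and 1 MB is taken as 1024 KB.

```python
KB_PER_MB = 1024  # assumption: binary units


def throughput_mb_s(tps: float, request_kb: float) -> float:
    """Data rate produced by `tps` requests of `request_kb` KB each."""
    return tps * request_kb / KB_PER_MB


def tps_needed(mb_s: float, request_kb: float) -> float:
    """Requests per second needed to reach `mb_s` with `request_kb` KB requests."""
    return mb_s * KB_PER_MB / request_kb


# Hypothetical random load: 1000 requests per second of 4 KB each.
print(f"{throughput_mb_s(1000, 4):.1f} MB/s")   # 3.9 MB/s

# The 73 MB/s from the text, if it had to be delivered in 4 KB requests.
print(f"{tps_needed(73, 4):.0f} requests/s")    # 18688 requests/s
```

A system can therefore post a respectable sequential rate and still be slow for a workload that consists of many small requests.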

Suitable benchmarks can be found in the open source world, for example. To determine the TPS values we use, among others, a free tool called FIO. On smaller storage systems we have also had good experiences with another program called Bonnie. For tests on particularly fast systems, e.g. when SSDs are used as caches, we had to switch to FIO. We will cover how the benchmark programs work and how to interpret their results in a separate blog article.
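
To give an idea of what such a test can look like, here is a small Python helper that assembles an fio command line for a random 4 KB load. It is a sketch under assumptions, not the job we actually ran: the helper fio_command, the target path, the queue depth, the number of jobs and the runtime are all illustrative values. The fio options themselves (--rw, --bs, --iodepth, --rwmixread and so on) are standard ones.

```python
import shlex


def fio_command(target: str, read_percent: int = 100, block_size: str = "4k",
                iodepth: int = 32, jobs: int = 4, seconds: int = 60) -> list:
    """Build (but do not run) an fio call for a random-access test."""
    args = [
        "fio", "--name=tps-test", f"--filename={target}", "--size=1G",
        "--direct=1",             # bypass the page cache of the test VM
        "--ioengine=libaio",      # asynchronous I/O on Linux
        f"--bs={block_size}", f"--iodepth={iodepth}", f"--numjobs={jobs}",
        f"--runtime={seconds}", "--time_based", "--group_reporting",
    ]
    if read_percent == 100:
        args.append("--rw=randread")
    elif read_percent == 0:
        args.append("--rw=randwrite")
    else:
        # mixed load: random reads and writes in the given ratio
        args += ["--rw=randrw", f"--rwmixread={read_percent}"]
    return args


# Hypothetical target: a test file, never a device that holds real data.
print(shlex.join(fio_command("/mnt/test/fio.dat", read_percent=70)))
```

The resulting list can be passed to subprocess.run on a machine where fio is installed; the IOPS figure in fio's report corresponds to what we call TPS here.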

What criteria are decisive for high TPS values?

The number of hard drives used.
Logically, the more hard drives (also called spindles) work at the same time, the more work gets done. This is especially true for TPS, because a conventional hard drive by design needs some time to reposition its read head. With many small accesses to different places on the disk, there are many short pauses while the head moves to the next position.
The RAID setup.
The RAID level determines how much work the controller has to do and how well the hard drives can work in parallel. For example, a write access in a RAID 50 array keeps more disks busy than in a RAID 10 array, because parity has to be updated along with the data. RAID 10 should therefore have a performance advantage for writes.
Performance and rotation speed of the hard drives used.
The faster the platters of a hard drive rotate, the less time passes until the requested data is under the read head. This rotational wait occurs in particular after every repositioning of the head.
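
The three criteria above can be combined into a back-of-the-envelope estimate. The sketch below is a rule of thumb, not a model of any particular controller: it ignores caches and request queueing, the seek times of 3.5 ms and 8.5 ms are assumed values, and the write penalties (2 disk operations per write for RAID 10, 4 for a small write on RAID 5/50) are the commonly quoted textbook figures.

```python
# Disk operations caused by one small logical write.
# RAID 10 writes both mirror halves; RAID 5/50 reads and rewrites data and parity.
WRITE_PENALTY = {"RAID 10": 2, "RAID 50": 4}


def rotational_latency_ms(rpm: float) -> float:
    """Average wait for the sector: half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def disk_iops(rpm: float, seek_ms: float) -> float:
    """Random accesses per second one spindle can serve."""
    return 1000 / (seek_ms + rotational_latency_ms(rpm))


def array_iops(disks: int, rpm: float, seek_ms: float,
               level: str, read_share: float) -> float:
    """Rough TPS estimate for an array under a read/write mix."""
    raw = disks * disk_iops(rpm, seek_ms)
    return raw / (read_share + (1 - read_share) * WRITE_PENALTY[level])


print(f"{rotational_latency_ms(15000):.2f} ms")  # 2.00 ms
print(f"{rotational_latency_ms(7200):.2f} ms")   # 4.17 ms

# Assumed seek time of 8.5 ms for a 7,200 rpm disk, 50% reads.
for level in WRITE_PENALTY:
    print(level, round(array_iops(14, 7200, 8.5, level, read_share=0.5)))
```

For pure reads the write penalty drops out and both RAID levels get the same estimate; the gap only opens up as the share of writes grows.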

In practice

So much for the theory. Now we want to know whether RAID 10 or RAID 50 is the better choice for our VMware cluster, and how we can improve performance.

Setup

VMware cluster with three ESX servers (96 GB RAM each) and a total of 50 virtual machines
Dell EqualLogic PS4000 with 12 hard drives at 15,000 rpm each
Dell EqualLogic PS4000 with 16 hard drives at 7,200 rpm each
A Linux VM with FIO benchmark software.

Results (detail)

	EQL1 (10 × 15,000 rpm)	EQL2 (14 × 7,200 rpm)	EQL2 (14 × 7,200 rpm)
	RAID 10	RAID 10	RAID 50
Read TPS, 4 KB	2161 tps	1406 tps	1390 tps
Read/write TPS, 4 KB	1360 tps	800 tps	595 tps

On EQL2, RAID 10 and RAID 50 are compared directly on the same hardware.
In addition, two RAID 10 setups are compared on very similar hardware but with different hard drive configurations.

Result:

As expected, RAID 10, the “most expensive” RAID level, is also the fastest in terms of TPS. However, its lead is not very large (between 5% and 25% faster than RAID 50), and there is another surprise.
The type of hard drive, especially its rotation speed, appears to be what ultimately determines performance, even in large arrays. Our results suggest that doubling the rotation speed also roughly doubles the number of transactions (provided you normalize for the number of hard drives and neglect differences in the rest of the hardware).
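
The normalization by number of hard drives can be checked against the excerpt of results shown above. The sketch below uses only the figures from that table, with the disk counts taken from its header (10 and 14); the data layout and the helper per_disk are ours.

```python
results = [
    {"system": "EQL1", "raid": "RAID 10", "disks": 10, "rpm": 15000,
     "read": 2161, "mixed": 1360},
    {"system": "EQL2", "raid": "RAID 10", "disks": 14, "rpm": 7200,
     "read": 1406, "mixed": 800},
    {"system": "EQL2", "raid": "RAID 50", "disks": 14, "rpm": 7200,
     "read": 1390, "mixed": 595},
]


def per_disk(row: dict, key: str) -> float:
    """TPS contributed by a single hard drive."""
    return row[key] / row["disks"]


eql1, eql2_r10, eql2_r50 = results

print(f"rpm ratio: {eql1['rpm'] / eql2_r10['rpm']:.2f}")
for key in ("read", "mixed"):
    # Same RAID level, different disks: effect of the rotation speed.
    speedup = per_disk(eql1, key) / per_disk(eql2_r10, key)
    # Same disks, different RAID level: lead of RAID 10 over RAID 50.
    lead = eql2_r10[key] / eql2_r50[key] - 1
    print(f"{key}: per-disk TPS x{speedup:.2f}, RAID 10 lead {lead:.0%}")
```

Per disk, the 15,000 rpm drives deliver about 2.15 times (read) and 2.38 times (mixed) the TPS of the 7,200 rpm drives, against a rotation speed ratio of 2.08. On this excerpt alone the RAID 10 lead over RAID 50 works out to about 1% for pure reads and about 34% for the mixed load; the range quoted above presumably draws on the full set of measurements.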

	RAID 10	RAID 50
Advantages	Best TPS values, fast rebuild, little performance loss with degraded arrays, slightly better redundancy	Best use of storage capacity (cheaper), decent TPS values, small advantage when reading large amounts of data
Disadvantages	Only 50% of the storage capacity can be used (more expensive)	Performance loss on degraded arrays and during rebuild, slightly less redundancy
Purpose	Fast databases, storage for systems (virtual and physical) with maximum performance requirements	Large databases, storage for (virtual and physical) computer systems with normal requirements, backup, file servers

Based on these measurements, we decided to operate two different storage systems for our VMware cluster: one system with a lot of storage capacity, based on normal-speed disks and RAID 50, and a somewhat smaller system with fast disks and RAID 10. This lets us place every virtual machine, and every other system that needs storage, on the storage that suits its requirements.

IT remains exciting!

Hannes Wilhelm
WorNet AG 2012
