A Redundant Array of Independent Drives (or Disks), also known as Redundant Array of Inexpensive Drives (or Disks) (RAID), is a term for data storage schemes that divide and/or replicate data among multiple hard drives. RAID can be designed to provide increased data reliability or increased I/O performance, though one goal may compromise the other. There are 10 RAID levels. But which one is recommended for data safety and performance, considering that hard drives are commodity priced?
I did some research over the last few months, and based upon my experience I started to use RAID 10 for both VMware / Xen virtualization and database servers. A few MS-Exchange and Oracle admins also recommended RAID 10 over RAID 5 for both safety and performance.
Quick RAID 10 overview (raid 10 explained)
RAID 10 = RAID 0 + RAID 1 combined. It stripes data across mirrored pairs of disks, so it provides both performance and fault tolerance.
RAID 0 helps to increase performance by striping volume data across multiple disk drives.
RAID 1 provides disk mirroring which duplicates your data.
In some cases, RAID 10 offers faster data reads and writes than RAID 5 because it does not need to manage parity.
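To make the "striping over mirrors" idea concrete, here is a minimal sketch of how a logical block could map onto disks in a 4-disk RAID 10. The function name and the disk numbering (disks 0 and 1 form one mirror pair, disks 2 and 3 the other) are assumptions for illustration; real controllers use larger stripe units and their own layouts.

```python
from typing import Tuple

def raid10_locate(block: int, mirror_pairs: int = 2) -> Tuple[int, int]:
    """Return the two disks that hold a logical block (hypothetical layout)."""
    # RAID 0 step: consecutive blocks alternate between the mirror pairs.
    pair = block % mirror_pairs
    # RAID 1 step: both disks of the chosen pair store the same block.
    primary = 2 * pair
    return primary, primary + 1

for block in range(4):
    print(f"block {block} -> disks {raid10_locate(block)}")
```

Every block lives on two disks and no parity is computed, which is why a read can be served by either copy and a write costs two plain disk writes.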
RAID 5 vs RAID 10
From Art S. Kagel's research findings:
If a drive costs $1000US (and most are far less expensive than that) then switching from a 4 pair RAID10 array to a 5 drive RAID5 array will save 3 drives or $3000US. What is the cost of overtime, wear and tear on the technicians, DBAs, managers, and customers of even a recovery scare? What is the cost of reduced performance and possibly reduced customer satisfaction? Finally what is the cost of lost business if data is unrecoverable? I maintain that the drives are FAR cheaper! Hence my mantra:
NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5! NO RAID5!
Is RAID 5 Really a Bargain?
Cary Millsap, manager of Hotsos LLC and the editor of Hotsos Journal, reported the following findings in "Is RAID 5 Really a Bargain?":
- RAID 5 costs more for write-intensive applications than RAID 1.
- RAID 5 is less outage resilient than RAID 1.
- RAID 5 suffers massive performance degradation during partial outage.
- RAID 5 is less architecturally flexible than RAID 1.
- Correcting RAID 5 performance problems can be very expensive.
My practical experience with RAID array configuration
To make the picture clear, here is a RAID 10 vs RAID 5 configuration comparison for high-load database servers, VMware / Xen servers, mail servers, MS Exchange servers, etc.:
| RAID Level | Total array capacity | Fault tolerance | Read speed | Write speed |
|---|---|---|---|---|
| RAID-10 500GB x 4 disks | 1000 GB | 1 disk | 4X | 2X |
| RAID-5 500GB x 3 disks | 1000 GB | 1 disk | 2X | Speed of a RAID 5 depends upon the controller implementation |
You can clearly see that RAID 10 outperforms RAID 5 in both read and write operations, at the cost of one extra 500 GB disk for the same usable capacity.
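The capacity column follows from simple arithmetic: RAID 10 keeps half of the raw space, RAID 5 gives up one disk's worth to parity. A small sketch (the function name is made up for this example) reproduces the table and the 4-pair RAID 10 vs 5-drive RAID 5 comparison quoted earlier:

```python
def usable_capacity_gb(level: str, disks: int, disk_gb: int) -> int:
    """Usable capacity of an array built from equal-sized disks."""
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return disks // 2 * disk_gb      # every block is stored twice
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb     # one disk's worth of space holds parity
    raise ValueError(f"unsupported level: {level}")

print(usable_capacity_gb("raid10", 4, 500))  # 1000
print(usable_capacity_gb("raid5", 3, 500))   # 1000
```

With eight drives in RAID 10 and five drives in RAID 5 the function returns the same usable capacity, which is where the "save 3 drives" figure comes from.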
A note about backup
No RAID level will protect you from every multiple-disk failure. While one disk is offline for any reason, your disk array is not fully redundant. Therefore, good old tape backups are always recommended.
Please add your thoughts and experience in the comments below.
Further readings:
- Battle Against Any Raid Five initiative - A website dedicated to RAID related issues.
- RAID article on Wikipedia - Provides tons of information about both standard and non-standard RAID levels.
{ 71 comments… read them below or add one }
after years of experience, i avoid raid 5 at all costs.
the only level i dislike more is basic raid 0 configurations.
the cost per gig of drives is so cheap today, that i don’t see a reason to use less than raid 10 if you’re combining multiple disks. otherwise stick with raid 1.
if you’re going to spend the money for 3 drives, build a raid 1 with a hot spare. that way you’re covered in the event of a failure of one drive and you can replace it when convenient.
While the other reply is very old…if I ran into this…other people still are, too. RAID 5 is not “less than” RAID 10, which incorrectly implies that the higher number is better, instead of demonstrating that it’s a combination RAID set.
In case I’m not being clear, RAID 10 is 1+0, not “ten”. This distinction is important for two reasons: 1) RAID 01 isn’t RAID 1, which is what we’d get pronouncing it as a number, and 2) if we imply that higher is better, RAID 50 should be even better still…when it’s just a different way of doing things. This just isn’t what “5” vs “10” vs “50” means.
For the relevant discussion:
* RAID 01 = a mirrored stripe set
* RAID 10 = a striped mirror set
and…not to distract from the point, the general consensus is that RAID 1+0 is a good choice today.
Why waste everyone else’s time dismantling other people’s correct usage!
We all KNOW that 10 is just a shortcut for saying 1+0/0+1. No one has EVER said that 10 or 50 are implying “better” just because they are a higher number.
This is just in YOUR little head my friend, as we all know that already, so you’re wasting everyone’s time here by adding stupid non-issue points like that. Please go away and annoy the thicko brigade somewhere else.
I came here to better understand RAID 10 and this page really explains it well.
Now, I got a bit lost. Is RAID 10 the same as RAID 01? Does a mirrored stripe offer better performance than a striped mirror?? or is it just word play?????
I intend to use 4 X 2TB SATA II disks. I need a total of 4TB with mirrored copy. Which setup do you recommend RAID 01 or RAID 10 ??
thanks
@ccj,
Nice. These were my original sources:
* Why is RAID 1+0 better than RAID 0+1?
“[RAID 0+1] is not as robust as RAID 10 and cannot tolerate two simultaneous disk failures.”:
* http://en.wikipedia.org/wiki/Nested_RAID_levels#Six-drive_RAID_0.2B1
Cheers.
Posted by Admin on request.
Kirk got that right.
To repeat the point:
“RAID 10 = a striped mirror set”
It is RAID 1+0. The controller makes two RAID-1 (mirror) sets, then makes a RAID-0 from those two. Therefore a “1” followed by a “0”. That also means that a minimum of 4 disks is required for RAID-10.
As a comparison, a minimum of 3 disks are required for RAID-5.
Hi ccj. I actually appreciated the confirmation that Kirk wrote, but I am a novice as I am just learning about RAID terminology. It was nice to have the clarification.
While I have your attention, can you talk to the following question I have: What do you think about the RAID hardware (or is it software?) that is part of the LaCie 2big and the WD MyBook Studio II products? Is using those products as the mirrored sets of a software striped RAID set a good idea? I have xserves running 10.5 and would like to create a raid 0+1, with the 0 part being managed by the OS and the mirrored part handled by the drive units themselves. Thanks in advance!
I like RAID -10. Joking; I like RAID 0 + RAID 1, simply far more effective than all of the other RAID levels combined. I personally felt RAID 5 was better, but it has to maintain parity, which makes it difficult to maintain.
Thanks for the information Kirk, Since I’m not a “tecq” I greatly appreciate the information that you were kind enough to share and the clarity that you helped provide.
I do appreciate the explanation and distinction whereas some others did not. I think it is important to have the terms explained, for someone like myself just getting into RAID configurations on servers.
The article was short, but helpful. I appreciate you taking the time to post this information.
It looks like, with increasing HDD capacities, RAID 5 will not be able to provide data safety…
Very good article: Why RAID 5 stops working in 2009 at blogs.zdnet.com/storage/?p=162&tag=nl.e539
“With a 7 drive RAID 5 disk failure, you’ll have 6 remaining 2 TB drives. As the RAID controller is busily reading through those 6 disks to reconstruct the data from the failed drive, it is almost certain it will see an URE.
So the read fails. And when that happens, you are one unhappy camper. The message “we can’t read this RAID volume” travels up the chain of command until an error message is presented on the screen. 12 TB of your carefully protected – you thought! – data is gone. Oh, you didn’t back it up to tape? Bummer!
So now what?
The obvious answer, and the one that storage marketers have begun trumpeting, is RAID 6, which protects your data against 2 failures. Which is all well and good, until you consider this: as drives increase in size, any drive failure will always be accompanied by a read error. So RAID 6 will give you no more protection than RAID 5 does now, but you’ll pay more anyway for extra disk capacity and slower write performance.”
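The rebuild-risk claim in that quote can be sanity-checked with a back-of-the-envelope model. The sketch below assumes an unrecoverable read error (URE) rate of one per 10^14 bits read, a figure often quoted for consumer SATA drives but not stated in the excerpt above, and it assumes errors are independent, which real drives do not strictly obey. The function name is hypothetical.

```python
import math

def p_unrecoverable_read(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one URE while reading `bytes_read` bytes.

    Simple independence model: 1 - (1 - p) ** bits, computed with
    log1p/expm1 so the tiny per-bit probability is not lost to rounding.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# Rebuilding a 7-drive RAID 5 means reading the 6 surviving 2 TB drives in full.
p = p_unrecoverable_read(6 * 2e12)
print(f"chance of hitting a URE during the rebuild: {p:.0%}")
```

Under these assumptions the result is well above even odds but short of certainty, and it shrinks quickly for smaller arrays or drives with a better specified error rate.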
@Tomas M,
Nice find.
@shawn,
Good advice, this can be useful for webservers.
The ZDnet article is iffy at best. The statistical math is wonky, and the numbers are a worst case scenario if and ONLY if you max out the size of the array.
Besides, if you’re using SATA in the Enterprise, you deserve the high failure rate.
As to the comments about hard drive space being cheap – please share with me where you’re getting cheap 500G SAS or SCSI hds.
VonSkippy, you do know that SCSI drives are obsolete?
@David
no, scsi drives are not obsolete, the new standard is SAS which is Serial Attached SCSI. SCSI itself has not been deprecated, instead the connector has. I would also like to know where people are getting these “cheap” hard drives (600GB SAS 15K RPM Drives). As far as I know, the price range is around $650 each. A Raid 5 array starts at almost 2k, not cheap in my book…
@John
Regarding sas 600GB 15K new !
€142 & €129
http://tinyurl.com/3dvob6j
http://tinyurl.com/3uxnx7c
X
As to the comments about hard drive space being cheap – please share with me where you’re getting cheap 500G SAS or SCSI hds.
I agree with you but if you purchase large in bulk you may get a good discount.
Besides, if you’re using SATA in the Enterprise, you deserve the high failure rate.
This post covers up SATA vs SCSI / SAS issue nicely.
VonSkippy,
See this paper, the authors conclusions:
I do use SCSI or SAS for all enterprise servers for speed with high-RPM spindles and cache.
However, I do have several terabytes of SATA storage for archives.
Hope this helps!
i agree that sas is a better option, albeit more expensive, but that said; i run multiple low-mid range dell servers on sata 500gig raid 1’s and they’re perfectly fine.
i’d also like to point out that the linked article about sata references 150’s and not 3.0’s/sata ii specs.
i’ve never had issue running sata ii’s with 16mb cache or higher (like 32’s now) in raid configurations; even for high end database systems.
is RAID-10 fault tolerance only 1 disk? shouldn’t it be 2 disk but not on the same raid 1 pair?
my comment is only based on the example. you can lose 1 disk on each raid 1 pair, but not both on the same pair.
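The point about which second failure is fatal can be checked by brute force. This sketch assumes the 4-disk example from the table, with disks 0 and 1 forming one mirror pair and disks 2 and 3 the other (the numbering is an assumption), and tests every possible two-disk failure:

```python
from itertools import combinations

# Assumed layout of the 4-disk RAID 10 example: two mirror pairs.
MIRROR_PAIRS = [(0, 1), (2, 3)]

def survives(failed: set) -> bool:
    """The array survives as long as no mirror pair has lost both disks."""
    return all(not set(pair) <= failed for pair in MIRROR_PAIRS)

two_disk_failures = list(combinations(range(4), 2))
survivable = [f for f in two_disk_failures if survives(set(f))]

print(f"{len(survivable)} of {len(two_disk_failures)} two-disk failures are survivable")
print("fatal combinations:", [f for f in two_disk_failures if f not in survivable])
```

So "1 disk" is the guaranteed fault tolerance, while a second failure is survivable only when it lands in the other mirror pair.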
You have completely overlooked the use of hot spare drives. RAID 5 is fantastic with a good hardware controller, especially with multiple global hot spares. There is also RAID 50 to consider. Thought provoking article. Thanks.
You also have overlooked Sun’s ZFS filesystem, which is a quantum leap over existing filesystems, and does not require a hardware RAID controller. ZFS maintains a checksum for each block, and if necessary relocates data on a bad sector to a good sector. ZFS has a RAID-Z function which is supposedly better than RAID5. ZFS will also allow you to mirror with three or more drives, so if one fails there are at least two remaining. It also allows snapshots – you can rollback to a known good state – plus tons of other features. ZFS is available on Solaris, OpenSolaris, Nexenta, and FreeBSD operating systems. Unfortunately it is not on Linux, yet, because ZFS is released under the CDDL. Also the FreeBSD/Nexenta ZFS versions lag behind the Solaris version. You can use any hardware the operating system recognizes, IDE/SATA/SCSI/SAS, but with all the motherboards out there with 6 SATA ports, a cheap and effective file server can be put together using ZFS and SATA drives. You could have two servers with this configuration for the price of one server using a hardware RAID/SAS configuration. Back the data up from one server to the other – eliminates the tape backup problems.
Obviously every case is different. RAID 5 will always give you better price/performance/disk space than RAID 10 for 90-100% READ profiles.
In general, I’ve found that you can get the best performance for your money if you use RAID 10 on bigger slightly slower disks than you can using RAID 5 on smaller faster disks.
Everything is explained here:
Link
Hmm, you make it sound like RAID-5 is the worst thing in the world.
Sure it’s not as safe as RAID-10, but it’s MUCH more price efficient.
Also, I’ve personally run over 10 different RAID-5 systems with all different kind of OS’s and HW’s and never had a complete failure. This is both in Windows & Linux with SCSI, IDE and S-ATA drive configurations.
I should mention that my systems have not been subject to highly intensive database transactions with huge amounts of read/write, so speed hasn’t been my concern.
Reading this article i just find a “tiny hint” that you’re blowing everything out of proportion =)
I have had to change perhaps 5 drives over 8 years, though, and every array rebuilt itself successfully. Is that just pure luck? Certainly sounds like it reading this…
Have a nice day and happy “RAIDing” out there =)
//MARTiN
Hi um how is raid 10 safer than raid 5? in raid 10 if your system and not the drives goes on you then all your drives could be corrupted by the system.
As someone who has suffered drive failures in both raid 5 & 10 i am surprised not to see any mention of a few other considerations.
Time to rebuild a 4 disk Raid 10 is about 5 to 10 times faster than rebuilding a Raid 5.
Mainly due to the parity calculations but it also depends on controller algorithms and the fact that the Raid 5 may have 50% more data on it anyway.
The cost of building a 2tb raid is always going to be more expensive for Raid 10 (3 drives vs 4 drives), but with 1tb HDDs being relatively cheap (£70 in UK) you would have to value your data very low not to take the Raid 10 option. Then there’s the ‘cost’ of your time to be considered.
before anyone dismisses Esata raids, they do perform very well (mine are in 90s), and when the motherboard dies you dont have to worry about losing all your raids because the new motherboard uses a different raid chip!!
…they dont require you to install drivers during windows installation…
…or the BIOS Raid setting….
….you can add them at any time without having to solve the [ide][raid] motherboard setting problems for those who want to add a raid AFTER installing Windows….
..and of course you can move your esata raid to another machine without worry.
Thanks for listening.
Steve,
I’m new to raids. But are you saying if i have a raid 10 and my motherboard fails i can simply install a new mother board and i’m good to go. Or are you saying i would still have to set up raid in the new motherboard and that it would automatically recognize the existing raid. Very curious about this. please let me know.
thanks for reading
What of RAID 6?
RAID 6 is harder to define. It’s a sort of marketing term launched by some RAID vendors, and thus there are differences in implementation from one RAID 6 implementation to another.
RAID 6 offers more redundancy than RAID 5 (which is absolutely essential, RAID 5 is a walking disaster) at the cost of multiple parity writes per data write. This means the performance will be typically worse (although it’s not theoretically much worse, since the parity operations are in parallel). Additionally it is more complex, which nets a more complex and possibly more buggy implementation, and less flexibility with management.
I’m not sure of the theoretical differences of RAID 6 vs RAID 5 with hot standby.
In general, RAID 6 has the same performance signature as RAID 5 with improved reliability but a higher hardware cost. You can also get improved reliability along with higher performance with RAID 10, which is what I would recommend instead.
Joshua, I will need some convincing that calculating and writing a second parity string (RAID6) instead of just 1 (RAID5) “has the same performance signature”.
Hi guys,
Lets agree that RAID-5 is not safe enough on paper, how does it fare in REAL life?
A few people have said here its good enough.
RAID-6 is absolutely misunderstood here. In a RAID-6 config you’d have to lose 3 disks BEFORE your system went down. That just does not happen these days. I have never seen it in my 6 years of Storage Admin across enterprise customers, ranging from banks to telecoms to govt. orgs.
The whole “write performance hit” because of parity is also not quite as presented by some here. These days parity is constructed in NVRAM before being written to disk, and it’s written at the SAME time as any other I/O. Hence if your disks and CPU are not challenged you won’t suffer any perf. impact at all.
I am also baffled why some would regard IO to RAID as a bottleneck when typically there are other bottlenecks that are prevailing, such as network, SANs, CPU, misconfigurations.
Reconstruction next: of course reconstruction will take longer with RAID-DP. Some vendors have got DEDICATED parity disks that will reconstruct whilst the other disks are serving data. Hence the impact will once again be MINIMAL. It will be on CPU but hey, if your storage system is designed properly you are not CPU challenged = no perf. hit.
On IBM, when a disk fails, all disks, including those that serve data, will reconstruct and thus performance is known to drop 30-40%!! How expensive is that? So EACH time a disk fails you will have a cost.
Regards,
Eric
Well, it does, if someone steals your computer or the building catches fire. There’s not much point in making yourself absolutely protected against one threat (individual drive failures) as long as other threats (loss of all drives) exist that will walk past that protection. If you can make the risk to your data due to individual drive failures small compared with other risks then that’s enough.
RAID 5 is a mathematically elegant compromise that strikes me as a pretty good solution to that problem for many applications. It’s obviously a great deal safer than the current most prevalent solution in many contexts, which is a single hard drive, even if it’s not as secure as a less sophisticated scheme like RAID 10.
At least in theory. The actual performance, though, depends on the controller, and I don’t know enough about the choices there.
One advantage to Raid-10 is that if a drive does go down, you won’t notice a performance hit as you would with a Raid-5 while it has to rebuild the array.
Regards,
Rob
Raid-5 vs Raid-10
You can minimize the price difference and play it off, but the difference between RAID 5 and RAID 10 cost as storage space goes up gets pretty hairy.
C = Cost per disk
S = total storage area
s = size of individual disk
RAID 5 cost: C((S/s) + 1)
RAID 10 cost: C((S/s) * 2)
(it should be obvious that you would round up S/s)
If S is 2 TB, s is 146 GB, and C is $500, RAID 5 would cost $7,500. Equivalent RAID 10 would cost $14,000. That’s a pretty hefty price increase, especially when you’re on a budget that still has to pay for CPU’s/RAM/etc…
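The two cost formulas above are easy to check in a few lines. The function names are invented for this sketch, and 2 TB is taken as 2000 GB, which is what the quoted totals imply:

```python
import math

def raid5_cost(total_gb: float, disk_gb: float, cost_per_disk: float) -> float:
    """C * (ceil(S / s) + 1): the data disks plus one disk's worth of parity."""
    return cost_per_disk * (math.ceil(total_gb / disk_gb) + 1)

def raid10_cost(total_gb: float, disk_gb: float, cost_per_disk: float) -> float:
    """C * (ceil(S / s) * 2): every data disk gets a mirror."""
    return cost_per_disk * (math.ceil(total_gb / disk_gb) * 2)

print(raid5_cost(2000, 146, 500))   # 7500
print(raid10_cost(2000, 146, 500))  # 14000
```

That is 15 disks for RAID 5 against 28 for RAID 10, so the gap grows by one disk for every extra disk of usable capacity.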
That said, I don’t run RAID 5 on production databases. I **do** run RAID 5 on reporting databases that handle massive read-only queries, but I can’t afford RAID 5 performance loss for a disk rebuild on a production machine, nor can I afford to take the chance at losing 2 disks holding critical data (whether I have backups or not).
As for safety, RAID 10 definitely has the edge. In the example 2 TB array above, if I chose RAID 5 and had one disk fail, after that every other disk failure gives 100% chance of array loss. If I chose RAID 10, it would have to be the mirror to the disk that already failed (only about a 4% chance, and you can further reduce the odds for RAID 10 failure by creating mirrored pairs with disks from different manufacturers). So play the odds: when a disk fails, you’re looking at 100% chance total loss if another fails, or 4% chance total loss if another fails? Take your pick.
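The "about 4%" figure can be reproduced under one simplifying assumption: once a disk has failed, each surviving disk is equally likely to be the next to fail. For the 28-disk RAID 10 from the cost example, exactly one of the 27 survivors is the mirror partner of the failed disk. The function name is hypothetical:

```python
def p_second_failure_fatal_raid10(total_disks: int) -> float:
    """Chance that the next disk to fail is the dead disk's mirror partner.

    Assumes every surviving disk is equally likely to fail next.
    """
    return 1 / (total_disks - 1)

p = p_second_failure_fatal_raid10(28)
print(f"RAID 10, 28 disks: {p:.1%} chance the second failure is fatal")
print("RAID 5, any size: 100% chance the second failure is fatal")
```

The equal-likelihood assumption is generous to RAID 10 if the surviving mirror is under extra load during the rebuild, which is why mixing manufacturers within a pair is suggested above.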
It seems to me that, in theory and for large arrays and assuming the hard drives themselves are the bottlenecks, RAID 6 is going to be better than RAID 10 is every way: faster (less duplication needed), safer (it can guarantee survival after two drives die whereas RAID 10 can’t; it can even correct small errors on the disk) and more space-efficient (less duplication needed).
Whether this is achievable with extant controllers I don’t know, but if it isn’t I’d take it as evidence, not that it’s impossible, but that the market for that level of security isn’t large enough to make it worth developing the hardware. The standard implementation of RAID 5 may not suit every need but the general approach of using parity vice duplication is in principle sound.
Comparisons that neglect the price of hard drives are probably unsound, because by adding another half dozen hard drives you can always make the system safer yet.
The different approaches, etc. all represent three-dimensional surfaces of compromise in a discrete four-dimensional space of safety, speed, space and budget. Except for quite small arrays, the hyperplanes of the parity-based solutions are going to look more attractive than those of more primitive schemes like RAID 10.
> RAID 6 is going to be better than RAID 10 is every way
I use RAID 6 for many applications, but I would still avoid it for swap files, databases, maildirs, etc.
The possibility of a 2nd failure in the same mirror sets makes me reluctant to use RAID10.
Does the stress on a drive caused by rebuilding a mirror set make it more likely for a 2nd disk to fail? I wonder if anyone has any actual data on failure rates of RAID10.
RAID 6 seems like the best compromise to me, and I hope that modern hardware controllers with large cache will mitigate the performance issues.
RAID 10 is certainly worth it depending on the context and performance of your data. In fact it may save you money especially when you consider the performance degradation associated with RAID 5 and high random read / write IO databases such as Exchange. Another consideration is to look at host based striping at the same time. The key factor is that the storage system and the host should be sharing the load as much as possible. For more details, please feel free to comment and post on my article
When RAID 10 Is Worth The Economic Cost Link
“RAID 6 is going to be better than RAID 10 is every way: faster (less duplication needed)”
Parity calculations is what kills RAID 5 and RAID 6 for write performance, and unless your workload is a) read-only and/or b) highly sequential RAID 10 will outperform RAID 5 or RAID 6…particularly in a random read-write scenario RAID 10 easily outperforms RAID 5 and RAID 6.
Link #1 (ibm.com) and Link #2 (dell.com)
You’ll notice that media streaming or database logs (highly sequential) is where RAID 5 and RAID 6 shine, being outperformed only by RAID 0 on those tests. In virtually every other test RAID 10 has better performance.
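The random-write gap is usually explained with a "write penalty": a small random write costs about 2 disk I/Os on RAID 10 (one per mirror), about 4 on RAID 5 (read old data, read old parity, write new data, write new parity) and about 6 on RAID 6. The sketch below applies that rule of thumb. The 150 IOPS per disk figure is purely hypothetical, and controller caches and full-stripe sequential writes can change the picture considerably.

```python
# Commonly cited back-end I/Os per small random write (rule of thumb).
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

def effective_iops(disks: int, iops_per_disk: float,
                   write_fraction: float, level: str) -> float:
    """Estimate host-visible random IOPS for a given read/write mix."""
    raw = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    return raw / (read_fraction + write_fraction * WRITE_PENALTY[level])

for level in WRITE_PENALTY:
    mixed = effective_iops(4, 150, 0.5, level)      # 50% writes
    read_only = effective_iops(4, 150, 0.0, level)  # 0% writes
    print(f"{level}: {mixed:.0f} IOPS mixed, {read_only:.0f} IOPS read-only")
```

With no writes all three levels come out equal in this model, which matches the observation that parity RAID holds its own on read-heavy or highly sequential workloads.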
Mirror sets are wonderful when you have a hardware failure that DOESN’T cause some type of corruption in the data structure – however I have more often than not seen that the result is that both mirrors end up with problems, as well as one of the mirrors simply being dead. So I basically head towards Raid 5 (now 6 with the larger capacity issues) and then backups as well. If you of course have tens of thousands to spend then SAS is the way to go and don’t forget to replicate that information to another expensive array and then take intermittent snapshots as well. It really depends on what you are protecting doesn’t it?
“Parity calculations is what kills RAID 5 and RAID 6 for write performance”
This might be true for some controllers, or even for all current controllers, but it can’t be a long-term fundamental problem. A parity calculation doesn’t require many operations and just shouldn’t take very long, compared with hard drive write time. If it is taking a long time it’s because it’s being done in software by the computer or sequentially with the hard drive controller or there’s no buffering or something equally unfortunate. If RAID6 becomes at all common this problem ought to disappear.
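For readers wondering how little arithmetic a parity calculation involves, here is a minimal sketch of RAID 5 style XOR parity on three tiny data blocks. The block contents and the function name are made up for illustration; real arrays work on much larger stripe units and rotate the parity across disks.

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"RAID", b"five", b"demo"]   # one stripe: three data blocks
parity = xor_parity(data)            # stored on a fourth disk

# Simulate losing the disk that held data[1]: XOR-ing the surviving data
# blocks with the parity block reproduces the missing one.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt)
```

The XOR itself is cheap; what costs time on a small write is reading the old data and old parity before the new parity can be written.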
I have been in the IT industry for 12 years now. Have used off the shelf to high end raid controllers and never have I had a raid 5 totally die on me. I can’t believe the slam raid 5 is getting. Have people really experienced that many problems with raid 5 to say it’s the worst thing out there? Or are people just reading white papers and basing their opinions on that?
prove it…..
Where are the benchmarks? Oh, you got the opinion of some person in the field, that doesn’t prove anything. I’m in the field and I could say anything I wanted and you would have to take my word for it, that’s… not… a practical comparison especially not a scientific comparison!
Proof is:
Benchmarks
Statistical analysis
Where are the benchmarks of performance?????
Where are the huge statistics of drive failures?????
This blog is as good as useless! Sorry for being so critical but I wouldn’t trust the “word” of anyone without there being good proof.
I have been using RAID 5 on multiple servers for many years. Have had several hard drives fail but never a failure of the RAID 5. Simply replace the failed drive and keep on going. Can’t comment on the performance comparison with RAID 10 except to say I would need to see actual field test data.
I too can’t believe the negativity towards RAID 5 on here. Yes, RAID 1+0 is great for small data requirements, but if you use enterprise class SAS disks, the cost of implementing and maintaining large amounts of data storage far outweighs the risk associated with RAID 5 or 6 in nearly all situations.
Yesterday we had a Raid 10 breakdown with 4 drives dead within a few minutes (BTW those were WD). Data was gone but Tape backup was our safety.
There is no perfect Raid Solution on earth. It all depends on the value of the data. If your data is not important or you simply don’t care, take whatever you think suits your budget.
If your data is important then take Raid 10 and don’t care about the extra money you spend. And if you ask me Raid 10 is cheap. Much cheaper than the Data Recovery Service you need if something goes wrong.
Never forget your tape backup.
If I am reading this correctly, then this article says that a raid 1+0 array offers 4x performance, when you only have 4 drives. You cannot compare a 3-drive raid 5 to an 8 drive raid 1+0. If it is a 4 drive raid 1+0, then it only offers a 2x read speed increase, and this article is incorrect. Please get your facts right, or don’t compare the speeds of something when you have more than twice the number of drives!
I think you are mistaken regarding performance of a 4 disk RAID 10. The performance for a read would indeed be up to 4x. For example, if the array is organized as two 2-disk RAID1’s striped together in RAID0 then all the disks can work simultaneously to retrieve a file. Correct me if I’m wrong. I think you are assuming the speed of RAID1 is only 1x for reads, but either member of the mirrored pair can furnish the data, so this is not true; each drive only has to supply half the requested file.
As regards the general question of RAID 5 versus RAID 10, I agree with the original article. In about 7 years of experience with RAID, I’ve had two RAID 5 failures which were extremely problematic or costly to fix, so I don’t trust it any more. Personally, I’m going with RAID 10 from now on.
Christian hit the nail on the head that there is no perfect RAID solution and Veral’s point about the author comparing apples to oranges is a good one, too. Steve is a moron and clearly has never installed an OS on a RAID array if he thinks you can magically bypass drivers just by changing RAID types. Drivers drive the RAID controller and are NOT RAID level specific, but thanks for the laugh at your expense.
As has been said many times in here your specific requirement is what will drive your decision and whether they like it or not, RAID 5 is the most commonly used RAID level in the SMB space. I personally use the RAID 1 / RAID 5 combo that was mentioned earlier for most server installs. As I am an IT consultant with over 10 years experience in the SMB space, I have found this to be more than adequate as I have yet to see anyone that truly has the “high write volume” that they might think they have which would illuminate any performance gain for any other RAID level. I am sure the haters will say “oh you don’t deal with big enough clients” but to them I can only say, keep smokin the happy pipe and telling yourself you know what you are talking about but don’t be surprised when your clients detect your incompetence.
I have also personally experienced failures on all different types of RAID arrays and the only one that ever completely took the server down was when we lost 1 drive on a RAID 10 system and not 12 hours later lost the 2nd one before the first replacement drive arrived. With that being said, don’t be a moron; do your backups. It was just the luck of the draw but the fact is that eventually every single drive you ever buy will fail so it’s YOUR job to get the bang for your buck and make sure you get some insurance (i.e. good backups). Oh and speaking of morons, all you people who are still suggesting tape as a “viable” backup solution you make me ROFL. Get a clue and learn to let go of your antiquated, failure-laden backup technology.
Hi Expert, would it be alright if you could suggest on the “good backup” that you’ve mentioned? As in what is the best way to insure your data, other than tape suggested by others?
For SMB? -External drive(s) running a differential backup with a 3-revision rule generally fit the criteria. Online storage as a supplement makes this a rock-solid choice for most environments.
Hi I was reading all the info here and I appreciate many analytical minds working hard keep it up. My issue is I am building a new high end home personal system and only have experience with raid 0, works great until a drive dies. I want to use 4 maby 5 Western Digital Black 2 terabyte 7200rpm drives, my goal first is performance and than also protection against at least 1 drive failing 2 would be much better though. It seems that the options of raid that I have found info on mostly all force me to take a hit on write speed and efficiency not to mention the unavoidable reduction of storage space but that is better than loosing my data. So many authors of articles I have read from several websites contradict each other and it leaves me wondering who to believe. In real world observable conditions, which raid solution will give me the most performance (I’m thinking fast write speeds), while protecting against total data loss being able to at least temporarily tolerate the loss of up to 2 drives. Less than 50 percent overhead is important too as I can only purchase and install so many drives and I need to have as much storage space as possible. Or am I dreaming and the raid solution I am looking for has yet to be developed? I welcome any thoughts on this matter thanx for reading my post.
My company has done well over 100, probably closer to 200, Raid 5 implementations, and at least another 30ish Raid 1s. I’m now starting to put in Raid 10 arrays for the virtual hosts. I’ve never lost an entire array due to the drives. I’ve lost 2 Raid 5 arrays due to controller failures. I’ve managed to recover from failed disks just by swapping the failed drive and letting the rebuild take place. I’m no expert by any stretch, but my practical experience is that there is no great risk in Raid 5 for the typical SMB implementation. That said, I always have a good backup in place as well (I won’t put servers into my customers’ locations w/out a backup). I do have 2 new Equallogic SANs w/ SATA drives that have both lost a drive in the past few months. IMO, the larger the physical disk, the larger the risk regarding disk failure.
My modest experience is that if something goes really wrong with a RAID, it’s often the controller. And then try to get this specific controller / NAS 3 years later… (we once waited a whole week for one to be shipped from the UK to Switzerland…)
That’s why I highly prefer RAID 10 over RAID 5…
What RAID configuration would you recommend for someone who is doing 3d rendering (ala Maya or 3ds Max) and only has 3 drives to work with?
Raid 5? Raid 1 with a backup? Raid 0 with a backup?
I am new to RAID configurations so I appreciate any advice.
I don’t think I’ve ever seen a discussion split hairs as much as this one (and speaking on the internet, the abasement of that statement cannot be overestimated).
First of all, in a mission critical enterprise environment where cost really isn’t an issue, neither is your RAID array. You will have entire rack redundancy.
Second, RAID 5 is all the redundancy any small-scale systems administrator will ever need. Anyone who isn’t budgeting for and routinely replacing drives based on MTBF is, to be blunt, terrible at their job.
Third, the idea that you can safely lose 2 drives in a 1+0 array but not a 5/6 array carries the same logical acumen as your average lottery scratch card “system”.
In this day and age we have to assume any discussion of RAID redundancy and cost effectiveness is centered around SO/HO use, which is where RAID 5 shines brightest.
I have used RAID 5 since the mid-90s and had no problems. I had replaced failed drives and never had an issue until last year, when a drive failed and during the rebuild I received an unrecoverable read error, and that was it. We had to build a new array and restore from tape. That was a not-so-fun weekend.
I now run RAID 10 and 6 only and will probably go away from 6 in the future with the cost of drives going down.
I think some sort of reliable RAID should be used by all people. RAID 5 is not reliable with larger capacity drives since hard drives have become so unreliable, the chances are with a 2+TB array that a rebuild will fail even after a single disk failure. I’ve never liked or trusted RAID 5 in the first place.
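The rebuild risk described above can be estimated with a back-of-the-envelope model. The sketch below assumes an unrecoverable read error (URE) rate of 1 per 10^14 bits read, a figure commonly quoted on consumer drive spec sheets, and treats every bit as independent; both are assumptions for illustration, not measurements, and the function name ure_probability is made up here.

```python
import math


def ure_probability(bytes_read, ure_rate_per_bit=1e-14):
    """Chance of at least one URE while reading `bytes_read` bytes.

    Assumes independent bit errors at a fixed spec-sheet rate, which is
    a simplification of how real drives fail.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed via log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_rate_per_bit))


TB = 10**12
# Number of 2 TB drives that must be read end to end during a rebuild:
# 1 for a mirror, N-1 for an N-drive RAID 5.
for surviving_drives in (1, 3, 4):
    p = ure_probability(surviving_drives * 2 * TB)
    print(f"read {surviving_drives} x 2 TB: P(at least one URE) ~ {p:.1%}")
```

The point of the sketch is the shape of the curve: the more data a rebuild has to read from the surviving drives, the higher the chance of tripping over a URE, which is why large RAID 5 arrays fare worse than a mirror in this model.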
If you can’t afford RAID 1+0 then go for RAID 1 at least even though there are no performance gains. RAID 1+0 at least allows for up to 2 drive failures (of course they can’t be part of the same RAID 1).
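How often a RAID 1+0 array actually survives two failures can be counted directly. This sketch assumes two-way mirrors and two simultaneous, uniformly random drive failures (a simplification; the name survives_two_failures is made up for illustration), and simply enumerates every pair of failed drives.

```python
from fractions import Fraction
from itertools import combinations


def survives_two_failures(pairs):
    """Probability that a RAID 1+0 array built from `pairs` two-way mirrors
    survives two uniformly random drive failures (illustrative model)."""
    drives = range(2 * pairs)                 # drives 2k and 2k+1 form mirror k
    outcomes = list(combinations(drives, 2))  # every possible pair of failed drives
    fatal = sum(1 for a, b in outcomes if a // 2 == b // 2)
    return Fraction(len(outcomes) - fatal, len(outcomes))


for pairs in (2, 3, 4):
    print(f"{2 * pairs} drives: survives 2 failures "
          f"with probability {survives_two_failures(pairs)}")
```

With 4 drives the array survives in 2 out of 3 cases, and the odds improve as more mirror pairs are added, but a second failure is never guaranteed to be harmless.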
RAID 1+0 is the future, but the main point is that RAID is not a replacement for backups. RAID implementations will always change with the technology, based on the pros and cons. E.g. with SSDs, RAID 1 will be all we need and the performance from 1+0 will not be necessary, in my opinion.
We had an External 10TB enclosure (Enterprise Sata disk) SAS attached in a RAID 5 array. One disk died and while the hot spare was rebuilding (which as you can imagine took ages) we had another die also. Needless to say everything was lost, and a time consuming restore from backup was needed.
It’s now in a RAID 6 config to safeguard against one failure, but after this reading (and other material) I think the RAID 1+0 might have been a better scenario (although reduced storage implications).
Thanks for the article.
Here is what I know. (One guy’s 2 cents’ worth)
Each system’s I/O should be well known prior to selecting a storage solution…
Most systems that I need to design require at least 100TB of storage.
Raid 10 is cool where I need performance, often raid 5 is selected due to the price point.
What seems to be missing in this conversation is that in the real world the RTO is almost always 24 hours or more and a VM solution is in place, so the risk is mitigated by the backup solutions: vMotion, log shipping, bit-level file replication, and so on, as well as other real-world solutions.
The overall needs are what is important, so pick what works for your management.
Scott
Oh yeah, one more thing: think cloud, folks. Most of my current solutions for redundancy do not consider the RAID structure when the files are backed up in several locations at the bit level in almost real time for less than the cost of controllers and drives…
Management seems to love no capital cost solutions!
Again just my 2 cents worth,
Scott
Ok Scott W.,
What happens to your data when the “cloud” company either, a., goes under for whatever financial reason, b., gets hacked or DDoS’d because they pissed off the wrong geek, or c., some country’s government makes the service illegal or otherwise inaccessible, or spies on the data?
Sorry folks, data needs to stay local and under YOUR control. We can put apps in the cloud, but our data is far too important to entrust to some company out to make money off hosting it. And don’t give me the, “but no matter where it is it has to be secure and that costs money” argument. Yes, it does cost money to secure YOUR data. It’s part of what makes it important that you do so and ensure it STAYS YOUR DATA! Management needs to understand ALL the risks before putting their data in the hands of others, as should every citizen of every country in the world be wary.
Software as a service via the web, sure, all for it! Putting ALL my data and information about me in someone else’s hands that has NO thought for me other than to make a buck? HELL NO! Things *I* want to share, sure, for a period of time that is necessary I can put data somewhere on the Internet accessible to others. All of it, all the time…NO.
Jason,
If you put unencrypted data in the cloud you are asking for issues, to those of you who do, good luck with that and I hope your resume is up to date.
Vendor selection and diversification is also key, personally I am not too worried about Amazon or Microsoft’s financial stability at this time. Also where the data is stored is also important to me (With Amazon I get to pick what data centers the data is stored at…)
The “think cloud” approach assumes that you have a risk management view on remote storage. When you use a 2048-bit encryption key, where the data is stored is a low-risk proposition.
I do however disagree with thinking that apps are OK to put out there as most security issues around data leakage come from those that have authorized access to the data or the leakage comes via the application layer.
I appreciate your passion and views…
Regards,
Scott
http://www.linkedin.com/in/scottjwright
I’m sorry, you compared a 3 disk RAID 5 array with a 4 disk RAID 10 array, which is not very helpful. If you have 4 drives, RAID 5 will spread, say, 12 MB of data as 4 MB of raw data on each of the 4 drives; that’s 3x a single drive’s performance for read and write. RAID 10, however, will store the 12 MB on 2 drives and then duplicate it. Duplicating does not change performance, since the drives need to be synced together for consistency. So the RAID 5 will store 4 MB of raw data per drive whilst the RAID 10 is storing 6 MB. For 4 drives, RAID 5 is, in theory, 50% better for both read and write performance. More drives just means better results for RAID 5.
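The arithmetic in the 12 MB example can be reproduced with a few lines. This is only a sketch of the full-stripe case the comment describes; the helper name mb_per_drive is made up, and the read-modify-write cost that RAID 5 pays on small or partial-stripe writes is deliberately left out.

```python
def mb_per_drive(level, drives, payload_mb):
    """MB each drive writes to store `payload_mb` in one full-stripe write.

    Illustrative helper: models large sequential writes only, not the
    parity read-modify-write cycle of small RAID 5 writes.
    """
    if level == "5":
        return payload_mb / (drives - 1)     # data plus parity spread over all drives
    if level == "10":
        return payload_mb / (drives // 2)    # striped over the pairs, then mirrored
    raise ValueError(f"unsupported level: {level}")


for level in ("5", "10"):
    print(f"RAID {level} on 4 drives, 12 MB payload: "
          f"{mb_per_drive(level, 4, 12)} MB per drive")
```

This matches the 4 MB versus 6 MB figures above. Keep in mind that it covers the best case for RAID 5; workloads dominated by small random writes behave differently because each write also has to update parity.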
The only thing holding back performance now is the controller, which is of course, these days a waste of money, efficiency and another point of failure.
By using AHCI straight on the motherboard with Linux software RAID and no write-back caching, you get close to the theoretical performance by using the fastest available SATA controllers. SAS is just the same thing, with hotplug a certainty and better quality connectors.
I have re-created RAID5 after losing the partition table and got it all back, plus normal rebuild degradation is not an issue when the OS schedules it.
If you need more than 6 drives in the array for better random access, you will need to use a hardware controller, but these days, people use RAID 6 for large arrays, or even better a RAID aware file-system like ZFS or BTRFS.
Why is the recommended number of disks for creating a RAID 5 array five?
@Scott (The Pro-Cloud-Guy)
If you’re going to be in the cloud then at least you’ve got the good sense to encrypt the data, right? That said, you still have less protection in the long run with the cloud even if your data is encrypted. What is impossible or nearly impossible to crack today may not be so hard tomorrow or within a relatively short period of time down the road.
REMEMBER: No corporation, person or entity you contract with to hold, manage or otherwise handle your data is going to fight to protect said data as hard as you will, and that’s just human nature and business. If the government decides it wants access to your data then no vendor, and that includes the big boys like Amazon, is going to fight as hard as you would to keep said data out of the Feds’ hands. Encrypting that data does NOT guarantee its safety over the long haul.
The Cloud as a data storage solution is a bad idea for any data that you would normally keep secured in your own environment. The Cloud is just like the internet in that once data gets placed there you have no way to ensure your control over it.
At some point in the future (probably sooner than later) those entities that embraced the cloud for storage of private data are going to find out that they’ve made the same mistake as the fools who put every aspect of their life out on the social media sites. No number of promises of security, privacy or even protection will keep that data safe over the long haul. The one difference between the cloud embracers and the social media site users is that the Cloud embracers will have paid more money.
As the old saying goes “Let the buyer beware”; so “let the cloud embracer beware”.
This is only comparing a 3 disk array to a 4 disk array, which is not exactly a fair comparison. Granted, yes, RAID 10 is still faster than RAID 5.
Just my thoughts: most organizations I have worked with have been cost conscious and have looked at the best price-performance factoring. My judgement has been to look at the IOPS the applications need (including growth for the next x years) and decide on the setup. I recommend having one RAID 10 array with 4 disks set on a SAN and the rest of the disks as multiple arrays of RAID 50 (not 5). With enough cache power on the controller, most of the bane of having parity created is nullified. The RAID 10 array provides the much needed volume for write-intensive operations (though it may not be needed, I have a database background that keeps me locked into this notion). It is important to note that the cache available on the array plays a crucial role in the performance outcome. I keep this point in mind when procuring enterprise solutions. I am sure the equations will change further when we have SSDs put in.
What a discussion. Regardless if you use RAID0,1,5,10 or any combination, if you rely on them instead of a backup you’re going to have a bad time.
There is no such thing as 100% data security, and statistical analysis of these risks requires advanced mathematics. There are too many factors at play to account for. In an ideal world you would have two or more realtime copies of your data in different parts of the world.
The major limiting factor in data security is budget. Since no RAID offers adequate protection for data loss the most cost effective solution is to make a differential copy of your data onto an external device. Either you go offsite, use a NAS/external drive or use a second array of internal drives. Each of these has different risks, and also different costs.
So in my opinion it’s 6>5>10>1. If shit hits the fan none of the levels can guarantee data safety. If a RAID 10 has a failed drive and the matching mirror drive hits a URE, reconstruction is going to fail as well. The odds are slightly better than RAID 5 and slightly worse than RAID 6.
TL;DR: unless your data is not all that important to you do not use RAID as a substitute for backup.
If you are unlucky and both A1 disks fail, then you have encountered total data loss, whereas with RAID 5 you would need 3 disks to fail.