I still don't get it, Cerb.
Drives aren't perfect, and data errors happen. If you use software RAID on Linux, for instance, or ZFS, and have your logging turned up, you can see them being fixed during scrubs. You can't fix any that occur during the window when a RAID 5 is effectively a RAID 0. The chances of them occurring during a parity RAID rebuild are much higher than during a mirror-type RAID rebuild, due to the added load on all the drives, the sheer amount of reading that must be done, and the long rebuild time (which can be several days with big SATA drives). No additional drive even needs to fail for that to happen.
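To put rough numbers on that (my own back-of-the-envelope sketch, not anything from the thread, assuming the commonly quoted 1-error-per-1e14-bits URE rating for consumer SATA drives, independent errors, and a hypothetical 4-drive array of 4 TB disks): a mirror rebuild re-reads one surviving drive, while a RAID 5 rebuild has to re-read every surviving drive.

```python
import math

# Hedged back-of-the-envelope: probability of hitting at least one
# unrecoverable read error (URE) while re-reading data during a rebuild.
# Assumptions (mine, not from the thread): 1 URE per 1e14 bits read
# (a common consumer SATA spec), independent errors, 4 TB drives, 4 drives.

URE_RATE = 1e-14            # errors per bit read
TB = 1e12                   # decimal terabyte, as drive vendors count it

def p_at_least_one_ure(bytes_read, ure_rate=URE_RATE):
    """P(>= 1 URE) over a read of `bytes_read` bytes."""
    bits = bytes_read * 8
    # log1p/expm1 keep precision with such a tiny per-bit rate
    return -math.expm1(bits * math.log1p(-ure_rate))

drive_size = 4 * TB         # hypothetical drive size
n_drives = 4                # hypothetical array width

# Mirror (RAID 1/10) rebuild: read back one surviving drive.
print("mirror rebuild:", p_at_least_one_ure(drive_size))

# RAID 5 rebuild: read back every surviving drive in full.
print("RAID 5 rebuild:", p_at_least_one_ure((n_drives - 1) * drive_size))
```

With those assumptions it comes out to roughly a 27% chance of at least one URE for the mirror rebuild versus roughly 62% for the RAID 5 rebuild. Real drives usually beat the spec-sheet rate, so treat these as worst-case illustrations rather than predictions, but the gap between the two cases is the point.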
RAID 5 provides redundancy and the ability to continue providing access even when a drive has failed. That is the whole point, and it does it just fine.
That it does. What it doesn't provide is any protection for the data on the array while it's degraded. Between the time the array degrades and the time it is rebuilt, which can stretch into days, the protection you have is that of a RAID 0. RAID 6 gives you the data protection of a RAID 5 while degraded. The rebuild itself effectively even gives you scrubbing during that time between the degraded and optimal operational states. So RAID 6 remains a sound choice until drives reach the next point where it becomes likely that even RAID 6 won't make it through a rebuild (about 2020 based on the worst-case numbers, IIRC, so really another 5+ years on top of that, unless error rates get worse in upcoming drives).
In fact, RAID 5 is the most cost-effective way to provide redundant storage.
And RAID 6 adds protection to that, typically for <=$200 more, which is nothing compared to your time plus the potential productivity losses.
If another drive fails during the rebuild, you simply create a new array and restore from backup.
Most people expect to put a new drive in and keep going. So you pop a new drive in, and it fails the rebuild during working hours (i.e., not another drive failing, but a bad stripe, which, depending on your controller and config, may or may not halt the rebuild entirely, but may necessitate an fsck or chkdsk, which is another avoidable downtime), after days of being slow as molasses and hindering users, and now what? It could have been prevented with a different RAID setup, or you could have gone straight to backups. If the system is one where going to backups during work hours is fine, and an efficient means of doing so is part of your DR planning, then that's fine.
Sounds like you're one of those guys that tries to pass his RAID off as a backup and is cruising to get burned.
If your array has a real likelihood of failing during a rebuild, however, and you can't confine that rebuild to off hours, that's a risk that's generally easy to mitigate. The general point of RAID is so that you can keep on working without needing to go to backups (specifically, without the downtime involved in that), and/or to protect backups that are too large or too new for other storage media, especially when access to the volume is needed. That includes not only the drives functioning, but also being able to reconstruct inaccessible data, be it through parity or a mirror. The chances of being able to do that are staying in favor of 6 and 10, but are no longer in favor of 5, at least for arrays of big SATA drives, as the chance of not being able to read some sector grows with drive density/capacity.
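As a rough illustration of that trend (same hedged assumptions as the earlier sketch: the worst-case 1-per-1e14-bits URE spec, independent errors, and a hypothetical 4-drive RAID 5 whose rebuild re-reads the three surviving drives):

```python
import math

URE_RATE = 1e-14   # worst-case consumer SATA spec, errors per bit read
TB = 1e12          # decimal terabyte

def p_ure(bytes_read):
    """P(>= 1 unrecoverable read error) over a read of `bytes_read` bytes."""
    return -math.expm1(bytes_read * 8 * math.log1p(-URE_RATE))

# Hypothetical 4-drive RAID 5: a rebuild re-reads the 3 surviving drives.
for drive_tb in (1, 2, 4, 8):
    rebuild_read = 3 * drive_tb * TB
    print(f"{drive_tb} TB drives: {p_ure(rebuild_read):.0%} chance of a URE during rebuild")
```

By those worst-case numbers, the chance of tripping over an unreadable sector during a RAID 5 rebuild climbs from around 21% with 1 TB drives to around 85% with 8 TB drives, which is the sense in which the odds are drifting away from 5 and staying with 6 and 10.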
Backups need to be done anyway, and even they might be done first and foremost to another machine's RAID array.
Agree.
I use RAID5 for its read/write speed. I also have a mirrored server. Last year, when the main server went down, the mirrored server kicked in. Almost no downtime. The main server syncs data in real time to the backup, and I need fast read/write speed for that.
Aside from RAID 5 offering poor write speed for random IO, what you have there is a redundant whole server, effectively giving you a delayed mirror.
You effectively have RAID 5+1, just that the mirror portion is set up with eventual consistency. I'm only talking about a lone RAID 5 array, on a lone server, in the typical budget-limited SMB that's likely to want to make an 8TB 3-drive RAID 5 with 7200 RPM SATA drives. Another server storing the same data that can be brought right up changes everything, because in that case, there is no rebuild issue--you go ahead and treat the degraded array as failed, and can easily work from a fresh array when getting that server back up, if needed.