
RAID5 failure: 2 bad HDD's at the same time


  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    Right now I have 2 RAIDs. A Raid 5 and a RAID 10. We will see how that goes.
    Nothing "wrong" with using RAID. My SAN has four 6-drive RAID6's in it. As I'm primarily using it in a R/O application, the "write penalty" doesn't come into play.

    The real issue is NOT relying on it to provide your (sole) backup!



  • CapLeaker
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Right now I have 2 RAIDs. A Raid 5 and a RAID 10. We will see how that goes.



  • Stefan Payne
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    RAID protects against drive failures, kinda...
    But you need to know the disadvantages or the effect it has on the other drives!

    For example, if you have a 4 drive RAID5 and one drive fails, you can still work.
    But if you have 4 identical drives, chances are the other 3 are also on their last legs and will die soon as well, since the wear and tear on all drives is similar...

    So you need to update your backup ASAP, get 4 new drives (or, if it's older, you might be able to get 2 and go for RAID1, or a 3-drive RAID5 or 4-drive RAID6 instead) and copy the content of the RAID to the new storage.

    The old RAID you either keep as a safety/emergency backup, or you destroy the drives. But you must not rebuild the array if it's older than a couple of weeks...

    If the array is new and one drive fails after a week, you can think about replacing the disk and rebuilding the array...
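
For comparing the layouts mentioned in this post (2-drive RAID1, 3-drive RAID5, 4-drive RAID6), here is a minimal Python sketch of the textbook capacity and fault-tolerance figures. The function name `raid_summary` and the 4 TB drive size are made up for illustration, and the numbers count whole-drive failures only; they say nothing about similar wear across identical drives or about read errors during a rebuild.

```python
def raid_summary(level: str, drives: int, drive_tb: float) -> tuple:
    """Return (usable_tb, whole_drive_failures_always_survived).

    Textbook figures only: ignores read errors during a rebuild,
    controller overhead, and vendor-specific layouts.
    """
    rules = {
        # level: (minimum drives, usable drive count, guaranteed tolerance)
        "RAID0": (2, lambda n: n, lambda n: 0),
        "RAID1": (2, lambda n: 1, lambda n: n - 1),
        "RAID5": (3, lambda n: n - 1, lambda n: 1),
        "RAID6": (4, lambda n: n - 2, lambda n: 2),
        # RAID10 may survive more failures, but only one is guaranteed.
        "RAID10": (4, lambda n: n // 2, lambda n: 1),
    }
    if level not in rules:
        raise ValueError(f"unknown RAID level: {level}")
    min_drives, usable, tolerance = rules[level]
    if drives < min_drives:
        raise ValueError(f"{level} needs at least {min_drives} drives")
    if level == "RAID10" and drives % 2:
        raise ValueError("RAID10 needs an even number of drives")
    return usable(drives) * drive_tb, tolerance(drives)


# Hypothetical 4 TB drives, in the three layouts suggested above.
for level, n in [("RAID1", 2), ("RAID5", 3), ("RAID6", 4)]:
    print(level, n, raid_summary(level, n, 4.0))
```

With equal drive sizes, the 3-drive RAID5 and the 4-drive RAID6 end up with the same usable space; the RAID6 spends its extra drive on surviving a second failure.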



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by eccerr0r View Post
    No, RAID1 is not backup, not any more than RAID3, RAID5, or even RAID6.

    An off disk backup will protect you from rm -rf / . RAID of any level will not protect you.
    ANY "backup" protects you (there's a reason tape is still used!).

    And, the example suggests the user may have CARELESSLY typed "rm -rf". Had he intentionally (or unintentionally/carelessly) done any number of OTHER things he could still lose data.

    Change one line of code in a program -- then, try to remember WHICH line it was and what the original version happened to be!

    Crop that photo of your kid's birthday party -- and then try to recreate the imagery that was lost!

    Boost the gain in an audio file (so parts peak above 0dB) and then try to recover the original signal. Or, attenuate it and try to re-boost it, later, without dragging the noise floor up in the process.

    RAID (except 0) gives you availability -- it makes the data stored more accessible in a variety of adverse conditions. Just like ECC helps make RAM data available in adverse conditions.



  • eccerr0r
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    No, RAID1 is not backup, not any more than RAID3, RAID5, or even RAID6.

    An off disk backup will protect you from rm -rf / . RAID of any level will not protect you.

    There is something to be said for versioning filesystems as a type of backup. Coupled with some sort of redundant RAID (i.e., not RAID0), this does qualify as a weak backup system. It still does not protect you from loss of the versioning filesystem's metadata, however.

    I've been contemplating a versioning filesystem, but still will need to snapshot them to another disk set. Currently I only snapshot my RAID5 to another RAID (RAID1, incidentally) -- up until the capacity of the RAID1 gets exceeded...then it's time for a new disk...or a file purge...
    Last edited by eccerr0r; 08-02-2019, 02:08 PM.



  • Stefan Payne
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by Curious.George View Post
    RAID is good for enhancing throughput (RAID0) or enhancing availability -- sort of like ECC RAM enhances the availability of data stored in "memory" (but, you wouldn't RELY on ECC RAM for long term storage/backup).
    Well, it also, in theory, should help against one drive failures.
    The Problem is that if one drive fails, another one shortly follows...

    And you have to know the advantages and disadvantages of RAID and the RAID Levels. And that if you have a 4 Drive RAID5 with 4 drives of the same type, the wear might be similar on all of them, so they might fail in short succession.

    But RAID is _NEVER_ a backup. And must not be seen as such...
    Well, RAID1 maybe, kinda...



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    This is the idea I have for later. Sooner or later I am going to upgrade my NAS, going from a 4-bay to an 8-bay NAS. I'd run a 6-drive RAID 6 and, in the other 2 bays, two large-capacity HDDs as JBOD.
    I'd still question the appropriateness of ANY RAID. Remember, all it provides is AVAILABILITY, not DURABILITY. The extra drives might be better used in another configuration (or even "kept cold")

    I've set up RAID on my SAN (and only on my SAN) because it essentially provides "C:" for the many virtual machines that are stored on it. When I activate a VM, I don't want to risk the VM hiccuping because of a disk error (e.g., a corrupt binary or otherwise unreadable file). But, I don't rely on the RAID to preserve my data any longer than that particular "session" (if the array crashes just as I log off, I don't care -- I'll see which drive(s) have failed, replace them and then reinitialize their content from my archives).



  • CapLeaker
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    This is the idea I have for later. Sooner or later I am going to upgrade my NAS, going from a 4-bay to an 8-bay NAS. I'd run a 6-drive RAID 6 and, in the other 2 bays, two large-capacity HDDs as JBOD.



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    I am building a new RAID now. Copying files like stupid! Well, I am not sure if I am going to play with the old RAID drives, since they are still under warranty and I'd like to get them replaced if I can. I didn't go for a new NAS. I just went for new HDDs. Maybe I'll look for a new NAS later in the fall.
    You might want to consider setting up the drives as JBOD and using some number of them "offline" to hold backups.

    RAID is good for enhancing throughput (RAID0) or enhancing availability -- sort of like ECC RAM enhances the availability of data stored in "memory" (but, you wouldn't RELY on ECC RAM for long term storage/backup).

    If you can tolerate the NAS (as JBOD) throwing an error from time to time and using that to prompt you to drag out the "backup", this may be a more effective use of the platters. Your AVAILABILITY goes down (cuz the data wasn't reliably present when you wanted it) but your DURABILITY goes up (cuz the data wasn't lost!).

    I'd still advocate playing with some of the recovery tools just to see if you'll have a fallback "out" in the future... (some vendors just roll a Linux distro into their appliances; others reinvent the wheel -- or, deliberately obfuscate some standard implementation just to tie you more closely to them)



  • CapLeaker
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    I am building a new RAID now. Copying files like stupid! Well, I am not sure if I am going to play with the old RAID drives, since they are still under warranty and I'd like to get them replaced if I can. I didn't go for a new NAS. I just went for new HDDs. Maybe I'll look for a new NAS later in the fall.



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    No, not all of my contents are lost. I was able to borrow another similar NAS.
    Perhaps you see why I have TWO of every piece of hardware?

    Over Prime Day on Amazon I've ordered 4 new HDD's. Plan is to "replace" the old raid with a new raid. Basically the old raid goes into the borrowed NAS. My NAS is getting all new drives then copy everything back.
    If you feel ambitious/inquisitive, you may want to keep the RAID drives (after you've returned the borrowed equipment hosting them, presently) and "play" with some of the RAID recovery tools that are available. Your data will already have been recovered so the drives' contents are "disposable" (?). I.e., you can play with them (and software recovery tools) without risk of LOSING anything.

    That could give you a head start if you find yourself in a similar situation at a future date (i.e., you will KNOW that it can be done and HOW to do it -- instead of HOPING it can be done and stressing over HOW to do it!)

    Interesting experience, eh?



  • CapLeaker
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    No, not all of my contents are lost. I was able to borrow another similar NAS. Over Prime Day on Amazon I've ordered 4 new HDD's. Plan is to "replace" the old raid with a new raid. Basically the old raid goes into the borrowed NAS. My NAS is getting all new drives then copy everything back.



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    I can clone it with dd or Clonezilla no problem, but my NAS sees it as a new HDD.
    If it is truly cloning the entire media surface, then the NAS must have some NVRAM in which it stores data from drive inquiry commands. E.g., I track drives in my "disk sanitizer" by storing the serial number, model number, etc. from the drive inquiry in a large database. So, when I next encounter the drive (e.g., when I install an OS image), I know its history.

    Usually, the drive is used to store this stuff (in a special partition or in the "unused" area right after the MBR).

    Regardless, this is one of the ways RAID f*cks you; had that been a "regular" disk, you could have thrown it in another machine and accessed its contents like normal (losing whatever part of the disk that may be afflicted with UREs).

    If you've already written off the data (as lost), you could try to recover the contents using one of the Windows/Linux tools that claim to be able to do so. At the very least, it will be a learning experience (and COULD yield positive results).

    Google "raid recovery" (and, please, report on any results!)
    Last edited by Curious.George; 07-14-2019, 10:45 AM.



  • CapLeaker
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    I can clone it with dd or Clonezilla no problem, but my NAS sees it as a new HDD.



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    that's why i thought it's not possible. I have to wait for some drives. Prime day is coming and I need a shit load of HDD's and a new NAS.
    dd(1) should clone the drive completely (there may be some issues with portions of the MBR under some OS's).

    Of course, now you're faced with the time it takes to read the entire medium.

    And, the real possibility that dd(1) will encounter a URE somewhere along the way (you'll have to sort out what "value" should be substituted for the "unknown" value, in that case).

    ISTR CZ has an option to just fall into dd(1) mode (instead of trying to understand the filesystem's structure)...?
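
To make the "what value gets substituted" question concrete, here is a minimal Python sketch of a bytewise copy that keeps going past unreadable blocks and fills them with a placeholder byte, loosely in the spirit of `dd conv=noerror,sync`. The function name `clone_with_fill`, the 4096-byte block size, and the zero fill are assumptions for illustration, not anything dd or Clonezilla does internally. It relies on `os.pread`, so it is Unix-only.

```python
import os


def clone_with_fill(src_path, dst_path, block_size=4096, fill=b"\x00"):
    """Copy src to dst block by block, padding unreadable blocks with `fill`.

    Returns the byte offsets of the blocks that could not be read in full,
    so you know exactly which parts of the image hold made-up data.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file
        # (and, on Linux, of a block device).
        size = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(block_size, size - offset)
                try:
                    data = os.pread(src, want, offset)
                except OSError:
                    # Treated here as an unrecoverable read error.
                    data = b""
                if len(data) < want:
                    # Failed or short read: record it and pad to full length
                    # so every later block stays at its original offset.
                    bad_offsets.append(offset)
                    data += fill * (want - len(data))
                dst.write(data)
                offset += want
    finally:
        os.close(src)
    return bad_offsets
```

A call such as `clone_with_fill("/dev/sdX", "disk.img")` (both names hypothetical) would return the list of damaged offsets. Smaller blocks lose less data around each bad spot but make the copy slower.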



  • CapLeaker
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    that's why i thought it's not possible. I have to wait for some drives. Prime day is coming and I need a shit load of HDD's and a new NAS.



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    Cloning the HDD with Clonezilla didn't work for me.
    Without knowing how (and WHERE!) the particular NAS stores the array configuration data on the drive, there's no way of knowing if CZ will even SEE it as "data". CZ cheats by only copying the portions of the drive that it KNOWS to contain data (i.e., by understanding file systems and other common disk structures). This lets it skip over the parts of the medium that it thinks are "empty" -- otherwise CZ would take as long as a bytewise copy operation.

    (Watch CZ in action and you will see how the thruput changes over the course of the operation)

    You may have to resort to a bytewise copy to be sure you are preserving all of the "stuff that matters" -- to your NAS!

    And, you're still stuck with the highly likely URE interfering with that operation -- the U in URE -- without the benefit of the redundant drives to compensate for it.

    16TB = 128,000,000,000,000 bits = 1.28 x 10^14 bits. Assume a URE rate of 1 in 10^14 bits read and you'd expect to hit about one URE in a single full pass...
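
The arithmetic above can be turned into a probability. This is a minimal Python sketch that assumes UREs are independent and occur at a constant per-bit rate, which is a simplification (real drives tend to fail in clusters, and the quoted rate is a spec-sheet bound rather than a measurement). The function name `p_at_least_one_ure` is made up for illustration.

```python
import math


def p_at_least_one_ure(capacity_bytes: float, ure_rate_per_bit: float) -> float:
    """Chance of at least one URE when every bit is read exactly once.

    Assumes independent errors at a constant per-bit rate.
    """
    bits = capacity_bytes * 8
    # 1 - (1 - rate)^bits, written with log1p/expm1 so the tiny rate
    # is not lost to floating-point rounding.
    return -math.expm1(bits * math.log1p(-ure_rate_per_bit))


# 16 TB read end to end at a URE rate of 1 in 10^14 bits.
p = p_at_least_one_ure(16e12, 1e-14)
print(f"{p:.2f}")  # prints 0.72
```

Under these assumptions, a full read of 16 TB hits at least one URE roughly 72% of the time.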



  • CapLeaker
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Cloning the HDD with Clonezilla didn't work for me.



  • Stefan Payne
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    Interesting... So you are saying to clone the bad HDD's in the RAID 5 array with clonezilla to a new drive and put it back into the array?
    It's worth a try.
    You might want to clone the other HDDs as well or move them immediately over to a new RAID Array.

    Originally posted by CapLeaker View Post
    I thought the array knows the HDD by serial number or something, so it would detect it as a "new" drive?
    No, that should be written in the MBR or wherever it does that.



    Anyway, rule of thumb:
    If one drive in a RAID array dies, do not rebuild it; back up your data and move it over to another array!

    Because when all are the same make/model, other drives failing is highly likely.



  • Curious.George
    replied
    Re: RAID5 failure: 2 bad HDD's at the same time

    Originally posted by CapLeaker View Post
    that is what I am aiming for, something where 2 drives can fail. Anyone tried the SHR2 from Synology?
    Note that you don't need a second "disk failure" -- a URE (during the rebuild) will effectively render a RAID5 (w/ failed disk) "broken". Make sure your NAS is doing patrol reads of the entire array lest you discover that URE when you can least afford it!
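
Why a single URE is enough can be seen from how single-parity reconstruction works. The following is a minimal Python sketch of XOR parity over one stripe, assuming a simplified model in which each drive contributes one block per stripe and a URE makes a whole block unavailable; real controllers and NAS firmware differ in layout and in how they react. The names `xor_blocks` and `rebuild_missing` are made up for illustration.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe: List[Optional[bytes]]) -> bytes:
    """Recompute the one missing block (None) of a single-parity stripe."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        # Single parity is one equation, so it can solve for one unknown only.
        raise ValueError(f"cannot rebuild: {len(missing)} blocks unavailable")
    return xor_blocks([block for block in stripe if block is not None])


# Three data blocks plus their parity block, as on a 4-drive RAID5.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = xor_blocks([d0, d1, d2])

# One failed drive: the missing block comes back from the survivors.
print(rebuild_missing([d0, None, d2, parity]) == d1)

# Failed drive plus a URE on a survivor: this stripe is gone.
try:
    rebuild_missing([d0, None, None, parity])
except ValueError as err:
    print(err)
```

A patrol read finds that second unavailable block while the array still has its redundancy, so it can be repaired before a rebuild depends on it.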

