
View Full Version : What to do with software raid?



susercius
29-Mar-2014, 07:58
Hello to everybody from Italy!
The title I've used may read as a stupid one. OK, maybe it is, but...
I've spent a lot of time searching the web for "what to do if software RAIDed disks (RAID1) have problems". Meaning: I'm not sure which actions to perform for data recovery in case of a disk failure.
In detail, the configuration I would use is:

/dev/sda = 1 TB (or more). /dev/sdb same as sda.
/dev/sda1 = 12 GB with all the stuff needed to work (kernel, boot and so on). This will be the boot disk.
/dev/sda2 (1 TB - 12 GB) as RAID1.
The same RAID configuration will be applied on sdb (1 TB - 12 GB). The first 12 GB on sdb1 could be empty, or could be a mirrored image of sda1 made via the "dd" command.

Then what do I do if sda1 (or sdb1) goes to hell?
Is all my data (/dev/md0) still available, or should I kill myself? :(

A friend suggested avoiding RAID and switching to LVM, along with a solid NAS to keep the data safe.
ANY answer is really welcome.
Thanks to everybody, and forgive my poor English.
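[For reference, a minimal sketch of how the RAID1 described above could be created with mdadm. The device names are taken from the layout in the post; this is an illustration of the general procedure, not something tested on this exact hardware.]

```shell
# Create a RAID1 array from the second partitions of the two disks.
# Requires root and DESTROYS any existing data on the listed partitions.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2

# Put a filesystem on the new array (ext3 was the SLES 11-era default).
mkfs.ext3 /dev/md0

# Record the array so it is assembled automatically at boot.
mdadm --detail --scan >> /etc/mdadm.conf
```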

cjcox
29-Mar-2014, 21:16
On 03/29/2014 02:04 AM, susercius wrote:
>
> Ciao to everybody from Italy!
> The title I've used could be read as a stupid one. Ok, it could be,
> but...
> I've spent a lot of time to search the web for "what to do if a Software
> Raided disks (Raid1) have problems"? Meaning that I'm not sure about the
> actions to be performed for data recovery in case of a disk failure.
> In depth, the configuration I would use is:
>
> /dev/sda = 1 TB (or more). /dev/sdb same as sda.
> /dev/sda1=12GB with all the stuff needed to work (kernel, boot and so
> on). This will be the boot disk.
> /dev/sda2 (1TB-12GB) as RAID1.
> The same RAID configuration will be applied on sdb (1TB-12GB) The first
> 12GB on sdb1 could be empty or could be a mirrored image of sda1 via
> "dd" command..
>
> Then what to do if sda1 (or sdb1) goes to hell?
> All my data (/dev/md0) are still available or should I kill myself? :(
>
> A friend suggested to avoid the use of RAID and switch to LVM along with
> a solid NAS for keep data saved.
> ANY answer is really welcome.
> Thanks to everybody and forgive my poor english.

Usually software RAID1 is used where no reliable storage alternative is available.

Also, it usually means the drives are not hot-pluggable. The difficulty is that
replacing a failed drive in a software RAID then requires a power-down, a
replacement, and a power-up.

Cheap NAS is a possible answer for "reliability"... but it's *very* slow. Most
supposed gigabit NAS units do maybe 200 Mbit on a good day (OK if you don't want
more than 10-20 MB/sec). There are good NAS alternatives out there, but they
will be pricey. Many of those cheap NAS solutions do offer some kind of RAID
(which is often software-style RAID behind the scenes) and the ability to hot
swap/replace failed drives.

While Linux software RAID will accept pretty much any block device, including
individual partitions, you can probably see the difficulty. Even if you do
mirror root and boot... the host won't simply "work" if the primary drive fails;
you'll have to do a bit of work to make sure things boot OK off the other drive.
It actually helps to build a test box and see the *work* required for yourself.

So... for full system reliability, it helps if all the storage is RAID through
and through. But most people can live with just their crucial data on a RAID
subsystem or NAS... so if the system fails, they just have to rebuild the base
system somehow and then remount their protected crucial data from some kind of
reliable storage.

So... software RAID is useful. And yes, your md0 on a RAID1 will continue to
work if you lose a drive (provided you don't have the root/boot problem
mentioned earlier).
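[A quick way to verify the point above on a running system is to inspect the array state; the output comments are illustrative:]

```shell
# "[UU]" in the output means both mirror halves are up; "[U_]" means degraded
# (one member lost) but the array, and your data, are still available.
cat /proc/mdstat

# Per-device detail: shows each member as active, faulty, or removed.
mdadm --detail /dev/md0
```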

There are cheap RAID1 subsystems, even internal ones, you can buy that mirror
at the device level rather than at the OS level (that is, they are OS
agnostic). But they usually come with a price tag and may also have other
requirements that make them more difficult to install/support. But I've used
them (e.g. https://www.accordancesystems.com/).

Some options:

1. Internal RAID1 Subsystem
Pros: Mirrors whole drive at the device level. Doesn't care about OS used.
Easy to use and easy to replace bad drives. Low overhead, fairly fast.
Cons: Pricey at $300 - $450USD

2. Software RAID1
Pros: Very cheap. Very fast (somewhat depends on RAID level and CPU used).
Cons: Somewhat hard to use. Can be very hard when partitions, boot and root are
involved.

3. HW RAID
Pros: Pretty fast
Cons: Pricey at $300 - $1000+ USD. Firmware support is always limited, so it
might not work with Linux forever.

4. NAS
Pros: Pretty easy to use, generally doesn't care too much about OS used.
Cons: Very pricey for good ones, moderately pricey for cheap ones. Most
beneficial for data areas rather than the OS drive. Affordable units (<$500)
are usually very slow (10-20 MB/sec).

5. SAN
Pros: Very fast. Very flexible and reliable.
Cons: Extremely expensive for good performance.

My favorite option for people on a budget wanting very reliable RAID1 for all
data is to recommend a RAID1 internal storage subsystem (noting that most OEM
desktops will have trouble housing some of these solutions).

I currently run HW RAID at home off a high-end Adaptec RAID controller. At
work we mostly use SAN (though it's just gigabit iSCSI). I have used (that is,
designed and deployed) everything from 8 Gbit FC SAN and 10 GbE NAS to the
mentioned Accordance internal subsystems to pure Linux software RAID.

susercius
30-Mar-2014, 09:55
Thank you, cjcox; your answer is a very nice one covering lots of "other" questions, but it doesn't give me a solution or a path to follow to resolve my doubt.
Maybe I posted the wrong question; I'll try to correct it. So the real worst-case scenario is this:
My server (a Dell PowerEdge 110 II) has two 1 TB disks. The first (which I will call HD1) is partitioned as follows: /dev/sda1 (12 GB as the boot unit with the kernel & all the OS stuff), 4 GB as swap. The remaining space belongs to "/dev/md0". The second HD (named HD2) has almost the same partition scheme: 12 GB + 4 GB empty, and its remaining space used as the second unit for the RAID (RAID1, I mean), mounted as /home.

Then the question is: if HD1 loses the first partition "/dev/sda1", or loses everything (sda1 AND md0: a fatal hardware crash), the data should still exist on HD2, shouldn't it?
If so, what actions must be performed to get the data back, maybe onto an external temporary device?
I've seen that a live distro (KNOPPIX, e.g.) does not see /dev/md0.
So?
Thanks.

cjcox
31-Mar-2014, 04:51
On 03/30/2014 04:04 AM, susercius wrote:
>
> Thank you, *cjcox*, your answer is a vary nice answer covering lot of
> "other" questions but does not give me any solution or any path to
> follow to solve the doubt.
> Maybe I've posted the wrong question; I' try to correct.......... So
> the real worst "scenario" is this:
> My server (a Dell PowerEdge 110 II) has two 1TB disks The first (that I
> will call *HD1*) is partioned as follows: /dev/sda1 (12GB as boot
> unit+kernel & all the O.S. stuff), 4 GB as SWAP. The remainig space is
> "/dev/md0". The second HD (named *HD2*) has almost the same partition
> schema: 12GB+4G empty, and its'remaining space used as the second unit
> for RAID: RAID1 I mean, mounted as /home.
>
> Then the question is : if *HD1* loses the first partition "/dev/sda1" or
> loses everything (sda1 AND md0: a fatal hw-crash) the data should still
> exist on *HD2*, isn't it?

Sorry, didn't mean to be obtuse... yes, md0 will still operate; your data
should be there.

> So what actions must be performed the get them back maybe on en
> external temporary device?.
> I've seen that a live distro (KNOPPIX e.g.) does not see /dev/md0.
> So?
> Thanks.

The md service has to be started to understand md0.

I haven't tested this on SLES 11, but if your / and /boot are all on sda1 and
you have that mirrored (md0) with sdb1, then ideally, through YaST (we hope),
you will have a system that can boot from either drive. You could test by
removing sda. But in your case, it sounds like maybe you're just mirroring
/home (right?). The data should be preserved OK, but if sda went out, you'd
still have to boot something up to get the data out by re-establishing the
RAID1 (md0). SUSE's recovery system likely has what you need to do this, but
again, I haven't tested it.
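[To make the "md service has to be started" remark concrete: from a live distro such as KNOPPIX, assembling the array by hand might look like the following. The member name /dev/sdb2 is an example; check your actual partitions first.]

```shell
# Scan all partitions for md superblocks and assemble any arrays found.
mdadm --assemble --scan

# If only one mirror half survives, force the array to start degraded
# (replace /dev/sdb2 with the actual surviving member).
mdadm --assemble --run /dev/md0 /dev/sdb2

# Mount read-only and copy the data off to an external device.
mount -o ro /dev/md0 /mnt
```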

susercius
31-Mar-2014, 07:10
Sorry, didn't mean to be obtuse.. yes... md0 will still operate, your data
should be there.

First of all, you are NOT obtuse; otherwise you wouldn't spend time answering me. Thanks :)
Second: I hope that md0 is still alive...

The md service has to be started to understand md0.

I haven't tested this on SLES 11, but if your / and /boot are all on sda1 and
you have that mirrored (md0) with sdb1 then ideally through YaST (we hope) you
will have a system that can boot using either drive. You could test by removing
sda. But in your case, sounds like maybe you're just mirroring /home (right?).
The data should be preserved ok, but if sda went out, you'd still have to boot
something up to get out the data by reestablishing the RAID1 (md0). SUSE's
recovery likely has what you need to do this, but again, I haven't tested it.


Third: I must check, but if I remember correctly, SLES does not allow booting from a mirrored "/" or "/boot". Certainly most of the HW RAIDs available on low-end servers (HP/IBM/Dell) do not allow it; Linux (SLES 11.xx) rejects such an operation. Anyway, I'll post the final answer tomorrow, just after my new test.
Thank you again.

jmozdzen
31-Mar-2014, 15:56
Hi susercius,

Third: I must check, but if I remeber SLES does not allow the booting from a mirrored "/" or "/boot". Surely most of H/W raids available on low end servers (HP/Ibm/Dell) do not allow. Linux (SLES 11.xx) reject such operation. Anyway I'll post the final answer tomorrow, just after my new test.
Thank you again.

(I suggest using the "reply with quote" feature of this board, it makes reading much easier... and if you manually insert further ... tags, you can even multi-quote.)

I'm running some SLES11 servers where *all* partitions (/boot, LVM PV, others) are created via software RAID. This works, but has to be hand-crafted in terms of boot options: First of all, you have to prepare your server to try to boot from multiple disks, then set up grub to try /boot from multiple disks in order, and lastly you'll need a boot sector on both physical disks.
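[The "boot sector on both physical disks" step above, with GRUB legacy as shipped on SLES 11, might be hand-crafted roughly like this (disk names are examples; adapt hd0/sdb to your layout):]

```shell
# Install the GRUB legacy boot sector on both disks, so the box can
# still boot when the primary disk is gone.
grub --batch <<'EOF'
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb    # temporarily map the second disk as hd0
root (hd0,0)
setup (hd0)
quit
EOF
```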

Booting from a hardware RAID - I don't see the problem here. Most RAID cards are supported under Linux, so booting from a hardware RAID set should be no problem.

Even with software RAID: If one of the devices fails (disk/partition), the RAID will be degraded but still works - no data loss. You then can re-activate the original "device" (i.e. /dev/sdb2, in case the disk was temporarily unavailable) or add a new device and the software will re-create the RAID volume for you.
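[The re-activate/re-add cycle described above corresponds to a short mdadm sequence; device names are examples:]

```shell
# Mark the bad member as failed and detach it from the array.
mdadm /dev/md0 --fail /dev/sdb2 --remove /dev/sdb2

# ...swap in the new disk, partition it like the old one, then:
mdadm /dev/md0 --add /dev/sdb2   # re-sync starts automatically

# Watch the rebuild progress.
cat /proc/mdstat
```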

While software RAID is very inexpensive, I do prefer hardware RAID for servers for anything that goes beyond simple boot support and low-volume i/o. It'll cost you an appropriate RAID card, but saves you a lot of possible pitfalls when configuring and running your server.

Using "NAS" doesn't compare to RAID at all - two different stories, in my eyes. "NAS" describes *how* you'll access the data. But if you want to be safe against harddisk failures, the NAS box will have to use RAID. So why complicate things by putting the RAID functionality into an extra box and depend on a network link to access the data? I'd then rather run the RAID inside my server - and do RAID for /boot and everything else from the disks, too. But that's just *my* opinion...

Regards,
Jens

susercius
02-Apr-2014, 09:32
Hi susercius,

I'm running some SLES11 servers where *all* partitions (/boot, LVM PV, others) are created via software RAID. This works, but has to be hand-crafted in terms of boot options: First of all, you have to prepare your server to try to boot from multiple disks, then set up grub to try /boot from multiple disks in order, and lastly you'll need a boot sector on both physical disks.

Booting from a hardware RAID - I don't see the problem here. Most RAID cards are supported under Linux, so booting from a hardware RAID set should be no problem.


I just tried it, using a "simple and linear installation" that joins all the space of the available disks in RAID1, so it works.



Even with software RAID: If one of the devices fails (disk/partition), the RAID will be degraded but still works - no data loss. You then can re-activate the original "device" (i.e. /dev/sdb2, in case the disk was temporarily unavailable) or add a new device and the software will re-create the RAID volume for you.


True.



While software RAID is very inexpensive, I do prefer hardware RAID for servers for anything that goes beyond simple boot support and low-volume i/o. It'll cost you an appropriate RAID card, but saves you a lot of possible pitfalls when configuring and running your server.


Correct, but consider that the cost of this Dell server is less than 350-400 €, while the cost of the cheapest good HW RAID (Adaptec 6505) is nearly 300-350 €, plus the stuff needed to hold the disks and so on, which must be added to the cost of the disks themselves... :p



Using "NAS" doesn't compare to RAID at all - two different stories, in my eyes. "NAS" describes *how* you'll access the data. But if you want to be safe against harddisk failures, the NAS box will have to use RAID. So why complicate things by putting the RAID functionality into an extra box and depend on a network link to access the data? I'd then rather run the RAID inside my server - and do RAID for /boot and everything else from the disks, too. But that's just *my* opinion...


This point is also correct. But view the problem from other angles:
1. OK for RAID1 (a SW one, because the HW solution is too expensive in my case).
2. Take into the highest consideration what type of data is handled by this server.
3. Take into the highest consideration, too, what type of users are working with this server. :(
4. Take into the highest consideration also the whole time required to rebuild a RAID, compared with the shortest time-to-availability of their data, as imposed by the customer. :mad:

So, summing up all the previous points, I think that a "stupid disk" should be readable from almost any PC. Such a target could be hit with a simple, low-cost NAS handling 1 (only 1) spare disk that (obviously) must contain all the data of the RAID1.

Any comment will be really welcome.

Regards to the readers
S.

KBOYLE
04-Apr-2014, 03:43
susercius wrote:

> Any comment will be really welcome

NAS is easy to install and use, especially if you just need data
storage. But even software RAID is likely to provide much better
performance than an inexpensive NAS.

It all depends what you are looking for.

--
Kevin Boyle - Knowledge Partner
If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below...

jmozdzen
07-Apr-2014, 16:57
Correct, but consider that the cost of this Dell-server is less then 350/400€, while the cost of a good cheapest HW-Raid (Adaptec 6505) is near 300/350€ plus the stuff needed to hold disks and so on that must be added to the cost of the disks themselves...:p

I have servers with lower hardware costs, "pimped" with Fiber Channel HBAs of much higher costs :D It's always a question of what you need and what you can get... and in your use case, software RAID seems to be a good option.


Also this point is correct. But view the problem from other points:
1. Ok for RAID 1 (a SW one, because the HW solution is too expensive in my case)
2. Take in the highest consideration what type of data is handled with this server.
3. Take in the highest consideration too, what type of users are working with this server.:(
4. Take in the highest consideratione also the whole timing required to rebuild a RAID, compared with the shortes time for availability of their data, as imposed by the customer.:mad:

So, after a summation of all the previous points I think that a "stupid disk" should be readable from almost any PCs. Such a target could be hit with a simple, low cost NAS handling 1 (only 1) spare disk that (obvoiusly) must contain all the data of RAID1.

I definitely don't follow you here. Since you're talking about wanting/needing RAID, you'll need it with a NAS, too. Anything that counts against RAID holds true with a NAS as well (esp. "4."). In other words: you need to compare server+RAID against server+NAS+RAID... the latter adds a layer of complexity and possibly another bottleneck (LAN access).

If your server fails, you'll need to set up another server anyhow to re-establish customer's access to the data. You'll be able to re-use the disks.

If (one of) your disks fails, you'll be rebuilding "live" anyhow, whether it's software RAID or some cheap HW RAID in the NAS. No difference, no downtime.

If you have a NAS, that enclosure can fail, and you'd need a second, compatible one to re-establish access (you have no control over how the disks are "formatted"; this doesn't have to be compatible across vendors or models).

If you have a NAS, that adds further components that may fail.

In summary, software RAID is cheaper (no NAS required), more reliable (fewer points of failure), and easier to maintain (less complexity, and possibly easier disk migration when comparing NAS failure to server failure).

Just my 2 cents...

Regards,
Jens

susercius
10-Apr-2014, 10:18
I have servers with lower hardware costs, "pimped" with Fiber Channel HBAs of much higher costs :D It's always a question of what you need and what you can get... and in your use case, software RAID seems to be a good option.



I definitely don't follow you here. Since you're talking about wanting/needing RAID, you'll need it with a NAS, too. Anything that counts against RAID holds true with a NAS as well (esp. "4."). In other words: You need to compare server+RAID against server+NAS+RAID... the latter adds a layer of complexity and possibly another bottle neck (LAN access).

If your server fails, you'll need to set up another server anyhow to re-establish customer's access to the data. You'll be able to re-use the disks.

If (one of your) disks fails, you'll be rebuilding "live" anyhow, whether it's software RAID or some cheap hw RAID in the NAS. No difference, no down time

If you have a NAS, that enclosure can fail and you need a second, compatible one to re-establish access (you have no control over how the disks are "formatted", this doesn't have to be compatible across vendors or models)

If you have a NAS, that adds further components that may fail.

In summary, software RAID is cheaper (no NAS required) and more reliable (less points of failure) and easier to maintain (less complexity, possibly easier disk migration comparing NAS failure to server failure).

Just my 2 cents...

Regards,
Jens

Near my desk there is a picture that expresses my actual thought.
It depicts the Pink Panther, and next to it is written:
"The GENIUS is something that you cannot explain."
This is for you, Jens; you are the one.
Thanks for the precious suggestions.
S.