Hardware vs. Software RAID in the Real World

I’m a sysadmin by trade, and as such I deal with RAID-enabled servers on a daily basis. Today a server with a hardware RAID controller reported a bad disk (when I say reported, I actually mean it lit a small red LED on the front of the machine), which is not uncommon. But when I replaced the failed disk with a shiny new one, suddenly both drives went red and the system crashed. Upon reboot the hardware controller said to me, “sorry buddy, I don’t see any drives,” and wouldn’t boot. Luckily I had a similar system sitting idle, so I tested the same disks in that server and they worked just fine. After cursing loudly at the RAID controller, I started wondering whether the pros of hardware RAID really outweigh the cons for a general-purpose, nothing-special, 1U or 2U server that just needs disk mirroring.

Pros of Hardware RAID

  • Easy to set up – Most controllers have a menu-driven wizard to guide you through building your array, or even come configured right out of the box.
  • Easy to replace disks – If a disk fails, just pull it out and replace it with a new one.
  • Performance improvements (sometimes) – If you are running tremendously intense workloads or an underpowered CPU, hardware RAID can offer a performance improvement.

Cons of Hardware RAID

  • Proprietary – Minimal or complete lack of detailed hardware and software specifications.
  • On-disk metadata can make it near impossible to recover data without a compatible RAID card – If your controller goes casters-up you’ll have to find a compatible model to replace it with; your disks won’t be usable without the controller. This is especially bad when working with a discontinued model that fails after years of operation.
  • Monitoring implementations are all over the road – Depending on the vendor and model, the ability and interface to monitor the health and performance of your array varies greatly. Often you are tied to a specific piece of software that the vendor provides.
  • Additional cost – Hardware RAID cards cost more than standard disk controllers. High-end models can be very expensive.
  • Lack of standardization between controllers (configuration, management, etc.) – The keystrokes and software that you use to manage and monitor one card likely won’t work on another.
  • Inflexible – Your ability to reshape, split and perform other maintenance on arrays varies tremendously with each card.

Pros of Software RAID

  • Hardware independent – RAID is implemented in the OS, which keeps things consistent regardless of the hardware manufacturer.
  • Standardized RAID configuration (for each OS) – The management toolkit is OS-specific rather than specific to each individual type of RAID card.
  • Standardized monitoring (for each OS) – Since the toolkit is consistent, you can monitor the health of each of your servers using the same methods (see the sketch after this list).
  • Good performance – CPUs just keep getting faster and faster. Unless you are performing tremendous amounts of I/O, the extra cost just doesn’t seem worthwhile.
  • Very flexible – Software RAID allows you to reconfigure your arrays in ways that I have not found possible with hardware controllers.
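
On Linux, for example, that standardized monitoring is just the md toolchain; a minimal health check, assuming a mirror at /dev/md0 (substitute your own array name), looks the same on every box:

    # Kernel view of every md array, including any rebuild in progress
    cat /proc/mdstat

    # Detailed state of a single array: active, failed and spare members
    mdadm --detail /dev/md0

    # mdmonitor can mail you about failure events if mdadm.conf has a MAILADDR
    grep MAILADDR /etc/mdadm.conf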

Cons of Software RAID

  • Need to learn the software RAID toolset for each OS – These tools are often well documented, but not as quick to get off the ground as their hardware counterparts.
  • Typically need to incorporate the RAID build into the OS install process – Servers won’t come mirrored out of the box; you’ll need to make sure this happens yourself.
  • Slower performance than dedicated hardware – A high-end dedicated RAID card will match or outperform software.
  • Additional load on the CPU – RAID operations have to be calculated somewhere, and in software this runs on your CPU instead of dedicated hardware. I have yet to see a real-world performance degradation introduced by this, however.
  • Disk replacement sometimes requires prep work – You typically need to tell the software RAID system to stop using a disk before you yank it out of the machine (see the sketch after this list). I’ve seen systems panic when a failed disk was physically removed before being logically removed.
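
Here’s roughly what that prep work looks like with Linux md; a minimal sketch, assuming a two-disk mirror at /dev/md0 with the failed member at /dev/sdb1 and the survivor at /dev/sda (names are placeholders for your setup):

    # Mark the dying disk as faulty, if md hasn't already done so
    mdadm --manage /dev/md0 --fail /dev/sdb1

    # Logically remove it from the array BEFORE physically pulling the drive
    mdadm --manage /dev/md0 --remove /dev/sdb1

    # After seating the new disk, copy the partition layout from the
    # survivor (assumes matching MBR layouts), then re-add to the array
    sfdisk -d /dev/sda | sfdisk /dev/sdb
    mdadm --manage /dev/md0 --add /dev/sdb1

    # Watch the rebuild progress
    cat /proc/mdstat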

So what do I make of all this?

I can certainly understand the argument for simple deployment and having a vendor to blame. But at the end of the day, if you lose your data then it’s gone. No vendor will be able to bring it back. And what about manageability after the initial deployment?

It seems to me that if you care about the integrity of your data and do not need ultra-intense I/O performance, then software RAID is a good choice. You get the same core features as hardware RAID, with a standard management and monitoring suite, and you are no longer bound to a specific piece of hardware. In the event of a disk controller failure you can simply replace the controller (or the whole server) and the disks remain usable. I also find it desirable that software RAID lets you define the RAID configuration during the installation process, which, when coupled with kickstart or a similar automated build process, ensures proper RAID configuration and monitoring as soon as the server is brought online.
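
To make that concrete, on a Red Hat-style system the whole mirror can be declared in the kickstart file, so every build comes up mirrored with no manual steps. A rough sketch; the device names, sizes and mount points here are placeholder assumptions, not a tested profile:

    # Hypothetical kickstart fragment: RAID1 across the first two disks
    part raid.01 --size=512 --ondisk=sda --asprimary
    part raid.02 --size=512 --ondisk=sdb --asprimary
    part raid.11 --size=1 --grow --ondisk=sda
    part raid.12 --size=1 --grow --ondisk=sdb

    raid /boot --level=RAID1 --device=md0 raid.01 raid.02
    raid /     --level=RAID1 --device=md1 raid.11 raid.12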

Considering these benefits, and my sour experiences with hardware RAID, I plan to deploy software RAID on a much wider scale from this point forward. But I’m curious…

Have others struggled with similar problems? How many people are running software RAID in an enterprise setting? And how many people have been similarly burned by software RAID?

11 Responses to “Hardware vs. Software RAID in the Real World”

    1. Some Guy Says:

      Nice blog. I’ve been hoping other people would post with some experience, because I’m in the middle of a decision and am leaning toward software but basically fear the unknown. One thing I really dislike about software RAID is that Red Hat rebuilds RAID10 volumes out of cron.weekly… any ideas why that might be? I disable it, but wonder why it would be there to begin with.

    2. keith Says:

      Hmm, I’ve seen the 99-raid-check script in cron.weekly, but I don’t think it’s supposed to be rebuilding all your RAID volumes. After some searching it appears that other people are experiencing this as well, so you may be hitting a bug in that script. I think as long as you’re running mdmonitor you should be in pretty good shape, though.
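
      For what it’s worth, on stock RHEL-style systems that weekly script just asks the kernel to scrub each array; a “check” is a read-and-compare pass, not a rebuild. A quick way to see what yours is actually doing, assuming an array at md0:

          # idle, check (scrub), or resync/recover (an actual rebuild)?
          cat /sys/block/md0/md/sync_action

          # what the cron job effectively runs for each array
          echo check > /sys/block/md0/md/sync_action

          # mismatches found by the last check; persistent non-zero counts merit a look
          cat /sys/block/md0/md/mismatch_cnt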

    3. sergio Says:

      I have found that software RAID has a lower cost not only because of the cost of the card itself, but because software RAID will work with any disk; with hardware RAID you need to use certified disk drives, which usually cost more money.

      For example, I found that the 3ware 9550SX RAID controller works very well with disks certified by 3ware, but when using consumer-level disks that are not certified it does not work well: massive timeouts, and eventually the disks develop bad sectors and get degraded. Those same disks work fine with software RAID, though with much lower performance.
      My conclusion is that for hardware RAID you have to use certain disks, usually of higher quality.
      With software RAID most disks will be fine.

      So when a disk dies under hardware RAID it is best to get the exact same drive; with software RAID that is not the case, and any drive of equal or larger size will do.

    4. iscaveline Says:

      hahha

    5. Eddie Says:

      I’ve been using software RAID, but I’ve noticed Linux reporting high memory usage. My server is hit at least once every second, with disk I/O happening each time. In this case it may make more sense to use hardware RAID, because I presume the massive RAM usage is due to software RAID.

    6. Јаневски Says:

      @Eddie How do you know that the memory usage is from the software RAID?
      Have you checked?
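
      One rough way to check, assuming an array at md0: md’s overhead lives in kernel memory rather than in any process, and high “used” memory on Linux is often just page cache:

          # how much is really cache vs. application memory
          free -m

          # kernel slab consumers, where md buffers would show up
          slabtop -o | head -20

          # stripe cache size for a RAID5/6 array (not present for RAID1)
          cat /sys/block/md0/md/stripe_cache_size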

    7. Raihan Says:

      Which is better for a dedicated server in terms of disk I/O performance and data recovery: RAID 10, or RAID 1 with a hardware RAID card?
      Thanks in advance,
      Raihan

    8. Patrick Says:

      I’ve been using software RAID on my dedicated server for 6.5 years now (24/7), and the only thing that happened was that one HDD was kicked out by the software for an unknown reason after 4 years. I re-added the drive and all the data was there. Nothing bad has happened to me, and the I/O performance is not bad either. The server is still running…

      Then I tested a hardware RAID (only on a consumer desktop PC) and the performance was OK, but it wasn’t stable. Rebuilds ran all the time and disks kept getting kicked out. I think this was because of the bad hardware controller and the non-certified HDDs, but you shouldn’t mess with hardware RAID if you don’t know what you’re doing.

    9. Linus Says:

      I had been running 2 hardware-RAID mirrored HDs in my server for about 2 years without problems. Then one day, without warning, ALL the data was suddenly reset to the state it had been in 2 weeks earlier, and a lot of data was even overwritten with random binary garbage. Both hard disks were totally fine. The 3ware controller apparently failed and destroyed a lot of hard work. NEVER again will I use hardware RAID. Since then I have been doing manual and automatic backups of important data, and I am now looking into deploying software RAID.

      john b Reply:

      Linus,

      I had the same issue with hardware RAID. The bottom line is that the hardware RAID adapter becomes a single point of failure. I too am looking into software RAID, as well as other options such as a single-disk server with block-level image CDP (Continuous Data Protection) to an external disk.

      Here is an interesting post from a futures trader who is also computer geek.
      http://augmentedtrader.wordpress.com/2012/05/13/10-things-raid/

      Linus Reply:

      Thank you John, that is a very interesting read. And yes, hardware RAID gave me a false sense of security. Although the hardware failure happened months ago, I still haven’t recovered 100% of my scripts and sites.

      It’s also very encouraging that the article says the amount of CPU used by software RAID on Ubuntu is actually quite low.
