Recovery Monkey: Musings on backups, storage, tuning and more

Thu 14 Jan '10

Pillar claiming their RAID5 is more reliable than RAID6? Wizardry or fiction?

I’m competing against Pillar at an account. One of the things they said: that their RAID5 is superior in reliability to RAID6. I wanted to put this in the public domain and, if true, invite Pillar engineers to comment here and explain how it works for all to see. If untrue, I again invite the Pillar engineers to comment and explain why it’s untrue.

The way I see it, it’s very simple: RAID5 is N+1 protection (a group survives one drive failure), RAID6 is N+2 (a group survives two). Mathematically, RAID5 is about 4,000 times more likely to lose data than a RAID6 group with the same number of data disks. Even RAID10 is about 160 times more likely to lose data than RAID6.
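For anyone who wants to play with the numbers, here is a rough sketch using the textbook mean-time-to-data-loss (MTTDL) approximations for independent drive failures. The drive MTTF, rebuild time and group size in it are illustrative assumptions, not the inputs behind the ratios quoted above, and the model ignores unrecoverable read errors and correlated failures, so treat the result as a ballpark only.

```python
def mttdl_raid5(data_disks, mttf_hours, mttr_hours):
    """Approximate MTTDL of one RAID5 group (data_disks + 1 parity drive).

    Data is lost when a second drive fails while the first is rebuilding.
    """
    n = data_disks + 1
    return mttf_hours ** 2 / (n * (n - 1) * mttr_hours)


def mttdl_raid6(data_disks, mttf_hours, mttr_hours):
    """Approximate MTTDL of one RAID6 group (data_disks + 2 parity drives).

    Data is lost only when a third drive fails during overlapping rebuilds.
    """
    n = data_disks + 2
    return mttf_hours ** 3 / (n * (n - 1) * (n - 2) * mttr_hours ** 2)


def raid5_vs_raid6_risk_ratio(data_disks, mttf_hours, mttr_hours):
    """How many times more likely RAID5 is to lose data than RAID6."""
    return (mttdl_raid6(data_disks, mttf_hours, mttr_hours)
            / mttdl_raid5(data_disks, mttf_hours, mttr_hours))


# Assumed inputs (hypothetical): 8 data disks, 500,000-hour drive MTTF,
# 12-hour rebuild time.
ratio = raid5_vs_raid6_risk_ratio(data_disks=8, mttf_hours=500_000, mttr_hours=12)
print(f"RAID5 is roughly {ratio:,.0f}x more likely to lose data than RAID6")
```

With those assumed inputs the ratio comes out at about 4,167. In this simple model the ratio reduces to MTTF / ((data disks + 2) × MTTR), so longer rebuilds or bigger groups shrink RAID6’s advantage, and shorter rebuilds grow it.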

The only downside to RAID6 is performance. If you want the protection of RAID6 but with extremely high performance, then look at NetApp: the RAID-DP that NetApp employs by default has, in many cases, better performance than even RAID10. Oracle has several PB of DBs running on NetApp RAID-DP. Can’t be all that bad.

See here for some info…

D

Sat 9 Jan '10

What if you could dramatically improve your application testing times? What would happen to your productivity and to the company’s bottom line?

So, let’s say the DBA (or insert some other discipline) wants to do some testing for a new product (known to happen occasionally) – and the way he would really like to test is to create 20 test cases, which requires 20 copies of the main database. He would then automate the test and therefore get results very quickly.

He approaches the storage admin with the problem, only to be told this isn’t possible since there isn’t enough space on the array. The DBA goes back to his cube frustrated, and figures out some ghetto way of creating at least 1 copy of the database, which creates the following problems:

  1. He has to figure out a way to do it (takes time)
  2. He can only test 1 case at a time (time)
  3. He cannot easily compare what-if scenarios between test cases (lack of flexibility)
  4. His ghetto way of doing it may involve single 1TB disks in a workstation (lack of reliability, time)

Ultimately, the testing takes longer, is error-prone, and the DBA’s productivity level goes way down.

What if the storage admin could, instead, tell the DBA that he can take even hundreds of copies of the DB with no issue at all?

What would happen to the DBA’s productivity?

What new ideas would he be able to come up with?

How would that affect the quality of the product?

How would that affect the company’s bottom line, being able to go to market with improved quality and quicker than the competition?

You see, intelligent storage – intelligently deployed – can solve many more problems than just “give me some space” or “give me more performance”.

There aren’t many technologies out there that can comfortably do this, which is probably why most storage people aren’t aware of this. But an array that can create space- and performance-efficient application-consistent DB clones is the ticket. Being able to create full copies and/or virtual space-efficient copies that end up being unusably slow doesn’t count… :)
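To make the space side of this concrete, here is a toy copy-on-write model. It is only a sketch of the general idea of a space-efficient clone: the `Volume` class and its methods are hypothetical names invented for illustration, it is not how any particular array implements clones, and it says nothing about the performance side.

```python
class Volume:
    """Toy block volume: a clone shares its parent's blocks until it writes."""

    def __init__(self, blocks=None, parent=None):
        self._own = dict(blocks or {})  # block number -> data stored by this volume
        self._parent = parent           # volume this clone reads through to, if any
        self._frozen = False            # True once clones depend on this volume

    def clone(self):
        # Creating a clone copies no data; the parent becomes a read-only snapshot.
        self._frozen = True
        return Volume(parent=self)

    def read(self, block_no):
        if block_no in self._own:
            return self._own[block_no]
        if self._parent is None:
            raise KeyError(block_no)
        return self._parent.read(block_no)

    def write(self, block_no, data):
        if self._frozen:
            raise RuntimeError("volume has clones and is treated as read-only")
        # Only changed blocks consume new space in the clone.
        self._own[block_no] = data

    def blocks_consumed(self):
        return len(self._own)


# Assumed scenario: a 1,000-block database, 20 test clones,
# and each test case changes 10 blocks.
base = Volume({n: f"block-{n}" for n in range(1000)})
clones = [base.clone() for _ in range(20)]
for clone in clones:
    for n in range(10):
        clone.write(n, "test data")

extra_blocks = sum(c.blocks_consumed() for c in clones)
full_copy_blocks = 20 * base.blocks_consumed()
print(f"clones use {extra_blocks} extra blocks; full copies would use {full_copy_blocks}")
```

In this toy scenario the 20 clones consume 200 extra blocks, against 20,000 for 20 full copies, which is why the “not enough space on the array” objection goes away.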

The only vendor I know of that can pull this off (properly) is NetApp with their FlexClone technology. One can even use it to deploy thousands of identical VMs… there are some use cases for that, too :)

Activision (the company that makes the famous Guitar Hero game) is a good example of using this technology to rapidly accelerate development. They ended up making the Christmas deadline, which resulted in several million more in sales. See here.

Oracle is another small company that uses this technology pervasively.

If anyone else knows of more vendors that can do this (properly) please chime in.

D

Should techies or business owners decide on technology (or both)?

It’s no secret that, in most companies, the technology folks are primarily the ones deciding on which new technologies to adopt – after all, they are the ones that understand the technology, right? Business owners explain the business problem to the technologists, and the techies take it from there – and ultimately present 2-3 different solutions that will work and the business picks the cheapest.

This could be great – if it weren’t for the fact that, like everyone, techies have their own agenda, which ends up tainting the decision process. Consider some of the following:

  • Comfort level with existing vendor (if it ain’t broke why fix it? This assumes all of the vendor’s products work equally well)
  • Job security (“why learn something new? Maybe they’ll hire someone that already knows this!”)
  • Delusions of grandeur (“I have the power!”)
  • Fear (“it sounds amazing, but what if the stuff doesn’t work?”)
  • Disbelief (“my current gear can’t do this, there’s no way this new stuff is that good!”)
  • Laziness (“you mean I have to test this new stuff? It cuts into my online gaming time!”)
  • Envy (“my buddy at this other company has this stuff, I must have something cooler/bigger!”)
  • Lack of time (“I really don’t have the time to test this new stuff!”)
  • Vendor kickbacks (we all know it happens in one form or another, and to the perennially under-paid techies, an expensive gift may be something they will never otherwise be able to afford, so it gains huge importance in their eyes)
  • The inability to grasp the real business drivers
  • The inability to think strategically
  • Being wowed by “cool” features that are of dubious business importance (see other post here)
  • Conversely, not understanding features that could be of immense business importance, that could save the company millions and increase productivity tenfold.

Normally, someone like a CIO or CTO acts as the bridge that spans the techie and business worlds, but that doesn’t always work (see here).

The only way around the issue is to create a new decision process for the company, one that involves all the interested parties from all departments. As complex as it may sound, this does work, and most of the time new ideas/issues get unearthed (“what do you mean my database is not backed up now?” or “what do you mean it would take 2 weeks to recover my lab environment?”)

Try it, you may be surprised at what happens!

D