Recovery Monkey: Musings on backups, storage, tuning and more

Sat
7
Aug '10

FUD tales from the blogosphere: when vendors attack (and a wee bit on expanding and balancing RAID groups)

Haven’t blogged in a while, way too busy. Against my better judgment, I thought I’d respond to some comments I’ve seen on the blogosphere, adding one of my trademark extremely long titles. Part response, part tutorial. People with no time to read it all: Skip to the end and see if you know the answer to the question or if you have ideas on how to do such a thing.

It’s funny how some vendors won’t hesitate to wholeheartedly agree when some “independent” blogger criticizes their competition (before I get flamed, independent in quotes since, as I discussed before, there ain’t no such thing whether said blogger realizes it or not – being biased is a basic human condition).

The equivalent of someone posting in an Audi forum about excessive brake dust, and having guys from Mercedes and BMW chime in and claim how they “tested” Audis and indeed they had issues (but of course!) and how their cars are better now and indeed maybe Audi doesn’t have as much of a lead any more (if, indeed, they ever did). I think the term for that is “shill” but I can understand taking every opportunity to harm an opponent.

So the “Storage Architect” posted entries asking about certain features to be implemented on NetApp storage, one of them being able to reduce the size of an aggregate. Then everyone and their mum jumped on and complained how on earth such an important feature isn’t there… :) BTW – I’m not saying such a thing wouldn’t be useful to have from time to time. I’ll just try to explain why it’s tricky to implement and maybe ways to avoid problems.

For the uninitiated, an “aggregate” is a collection of RAID-DP RAID groups that are pooled and striped, so that I/O hits all the drives from all RAID groups equally for performance. You then carve volumes out of that aggregate (containers for NFS, CIFS, iSCSI, FC).

A pretty simple structure, really, but effective. Similar constructs are used by many other storage vendors that allow pooling.

So, the question was, why not be able to make an aggregate smaller? (you can already make it bigger on-the-fly, as well as grow or shrink the existing volumes within).

An HP guy then proceeded to complain about how he put too few drives in an aggregate and ended up with an imbalanced configuration while trying to test a NetApp box.

So, some basics… the following picture shows a well-balanced pool – notice the equal number of drives per RAID group:

image

The idea being that everything is load-balanced:

image

Makes sense, right?

You then end up with pieces of data across all disks, which is the intent. Growing it is easy – which is, after all, what 99.99% of customers ever want to do.

However, the HP dude didn’t have enough disks to create a balanced config with the default-sized RAID group (16). So he ended up with something like this, not performance-optimal:

image

So what the HP dude wanted to do was reduce the size of the RAID group and remove drives, even though he had originally expanded the aggregate (and by extension the RAID group).

Normally, before one starts creating pools of storage (with any storage system), one also knows (or should) what one has to play with in order to get the best overall config. It’s like – “I want to build a 12-cylinder car engine, but I only have 9 cylinders”. Well – either buy more cylinders, or build an 8-cylinder engine… don’t start building the 12-cylinder engine and go “oops” :) This is just Storage 101. Mistakes can and do happen, of course.

So, with the current state of tech, if I only had 20 drives to play with (and no option to get more), assuming no spares, I’d rather do one of the following:

  1. Aggregate with 10 + 10 RAID groups inside or…
  2. Use all 20 drives in a single RAID group for max space
  3. Ask someone that knows the system better than I do for some advice

This is common sense and both doable and trivial with a NetApp system. The idea is you set the desired RAID group size for that aggregate BEFORE you put in disks. Not really difficult and pretty logical.
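To make the 20-drive options concrete, here is a minimal Python sketch. It assumes the aggregate simply fills each RAID group up to the configured raidsize before starting the next one; the function names are mine (nothing here is a Data ONTAP API), and real disk selection has more rules than this.

```python
def fill_raid_groups(total_disks, raidsize):
    """Split disks into RAID groups, filling each group up to raidsize.

    Simplified model (an assumption, not NetApp code): groups are filled
    in order and the last group gets whatever is left over.
    """
    if total_disks < 0 or raidsize < 1:
        raise ValueError("total_disks must be >= 0 and raidsize >= 1")
    full_groups, leftover = divmod(total_disks, raidsize)
    return [raidsize] * full_groups + ([leftover] if leftover else [])


def is_balanced(groups):
    """A layout counts as balanced here when every group has the same size."""
    return len(set(groups)) <= 1


for raidsize in (16, 10, 20):
    groups = fill_raid_groups(20, raidsize)
    verdict = "balanced" if is_balanced(groups) else "imbalanced"
    print(f"raidsize {raidsize}: {groups} {verdict}")
# raidsize 16: [16, 4] imbalanced
# raidsize 10: [10, 10] balanced
# raidsize 20: [20] balanced
```

With the default raidsize of 16, the 20 drives land in the lopsided 16 + 4 layout; setting raidsize to 10 or 20 first gives options #1 and #2.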

For instance, aggr options HPdudeAggr raidsize 10 before adding the drives would have achieved #1 above. Graphically, the Web GUI has that option in there as well, when you modify an aggregate. The option exists and it’s well-known and documented. Not knowing about it is a basic education issue. Arguing that no education should be needed to use a storage device (with an extreme number of features) properly even for deeply involved, low-level operations, is a romantic notion at best. Maybe some day. We are all working hard to make it a reality. Indeed, a lot of things that would take a really long time in the past (or still, with other boxes) have become trivialized – look at SnapDrive and the SnapManager products, for instance.

Back to our example: if, in the future, 10 more disks were purchased, and approach #1 above was taken, one would simply add the ten disks to the aggregate with aggr add HPdudeAggr 10. Resulting in a 10+10+10 config.

But what if I had done #2 above (make a 20-drive RAID group the default for that aggregate)?

Then, simply, you’d end up imbalanced again, with a 20+10. Some thought is needed before embarking on such journeys.

Maybe a better approach would be to add a more reasonable number of drives to achieve good balance? Adding 12 more drives, for example, would allow for an aggregate with 16+16 drives. So, one could simply change the raidsize using aggr options HPdudeAggr raidsize 16, then add the 12 disks to the aggregate with aggr add HPdudeAggr -g all 12.

This would expand both RAID groups contained within the aggregate dynamically to 16 drives per, resulting in a 16+16 configuration. Which, BTW, is not something you can easily do with most other storage systems…
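The three expansion scenarios above can be modeled with a few lines of Python. This is only a sketch of the behavior as described in this post: it assumes existing RAID groups are topped up in order until they reach raidsize, and that any remaining disks start new groups. The function name is hypothetical and the real system may distribute disks differently.

```python
def add_disks(groups, raidsize, new_disks):
    """Return the RAID group sizes after adding new_disks to an aggregate.

    Assumed model: top up existing groups to raidsize first, then create
    new groups of at most raidsize disks with whatever remains.
    """
    result = list(groups)
    remaining = new_disks
    for i, size in enumerate(result):
        take = min(max(raidsize - size, 0), remaining)
        result[i] += take
        remaining -= take
    while remaining > 0:
        take = min(raidsize, remaining)
        result.append(take)
        remaining -= take
    return result


print(add_disks([10, 10], raidsize=10, new_disks=10))  # [10, 10, 10]
print(add_disks([20], raidsize=20, new_disks=10))      # [20, 10]
print(add_disks([10, 10], raidsize=16, new_disks=12))  # [16, 16]
```

The first call is approach #1 growing to 10+10+10, the second is approach #2 ending up imbalanced, and the third is the raidsize change followed by adding 12 disks.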

Having said all that, I think that for people that are not storage savvy (or for the storage savvy that are suffering from temporary brain fog), a good enhancement would be for the interfaces to warn you about imbalanced final configs and show you what will be created in a nice graphical fashion, asking you if you agree (and possibly providing hints on how it could be done better).

I’m not aware of any other storage system that does that degree of handholding but hey, I don’t know everything.

Indeed, maybe the other posts were meant as bait, so I’ll obligingly take it and ask the question so you can advertise your wares here: :)

Is anyone aware of a well-featured storage system from an established, viable vendor that currently (Aug 7, 2010, not roadmap or “Real Soon Now”) allows the creation of a wide-striped pool of drives with some RAID structures underneath; then allows one to evacuate and then destroy some of those underlying RAID groups selectively, non-disruptively, without losing data, even though they already contain parts of the stripes; then change the RAID layout to something else using those same existing drives and restripe without requiring some sort of data migration to another pool and without needing to buy more drives? Again, NOT for expansion, but for the shrinking of the pool?

To clarify even further, what the HP guy did was exactly this: he had 20 drives to play with and, by mistake, created a pool with 2 RAID groups, a 14+2 and a 2+2. How would your solution take those 2 RAID groups, with data, and change the config to something like 10 + 10 without needing more drives or the destruction of anything?

Can you dynamically reduce a RAID group? (NetApp can dynamically expand, but not reduce a RAID group).

I’m not implying such a thing doesn’t exist, I’m merely curious. I could see ways to make this work by virtualizing RAID further. Still, it’s just one (small) part of the storage puzzle.

The one without sin may cast the first stone! :)

D

Mon
24
May '10

Et tu, Brute? EMC offering capacity guarantees? The sky is falling! Will Chuck resign?

It came to my attention that EMC is offering a 20% efficiency guarantee vs the competition (they seem to be focusing on NetApp as usual but that’s beside the point in this post). See here.

Now, I won’t go ahead and attack their guarantee. Good luck with that, more power to you etc etc. They need all the competitive edge they can get.

No, what I’ll do is expose yet more EMC messaging inconsistency. If you’ve been following the posts in my site you’ll notice that I have absolutely nothing against EMC products – but I do have issues with how they’re sold and marketed and what they’ll say about the competition.

First and foremost – most major storage players, with the notable exception of EMC, have been offering some kind of efficiency guarantee. Sure, you needed to read the fine print to see if your specific use case would be covered (like with every binding document), but at least the guarantees were there. NetApp was first with our 50% efficiency guarantee, then came others (HDS and 3Par are just some that come to mind). We even offer a 35% guarantee if we virtualize EMC arrays :)

We all have different ways of getting the efficiency – NetApp has a combo of deduplication, thin provisioning, snapshots, highly efficient RAID and thin cloning, for instance. Others have a subset (3Par has their really good thin provisioning, for instance). Regardless, we all tried to offer some measure of extra efficiency in these hard economic times.

And it’s not just marketing – I have multiple customers that, especially on virtualized environments, save at least 70% (that’s a real 70%, not 70% because we switched them from RAID10 to RAID-DP – literally, a 10TB data set is occupying 3TB). And for deployments like VDI, the savings are in the extreme range.

EMC’s stance was to, at a minimum, ridicule said guarantees. The inimitable Barry Burke (the storage anarchist) had this pretty funny post.

Chuck Hollis has been far more polemic about this – the worst was when he said he’d quit if EMC tried to do something similar (see here in the comments). BTW – we are all waiting for that resignation :) (on a more serious note, Chuck, if you don’t resign because of this, at least refrain from promising next time).

He also called other guarantees “shenanigans” here. I guess he’s really against the idea of guarantees.

But now it’s all good – you see, EMC is offering a blanket 20% efficiency guarantee versus the competition! I.e. – they will be able to provide 20% more actual usable storage or else they’ll give you free drives to cover the difference. You see, this guarantee is real, not like the ones all the other companies offer :)

Kidding aside, methinks they’re missing the point – this (to go back to my favorite car analogies) is like saying: “Both our car and your car have a 3-liter engine, but yours has twin turbos and a racing intercooler and 3 times the horsepower – but we won’t take any of that into account, we will strictly examine whether you indeed have a 3-liter engine, and we’ll bore ours out to make it 3.6 liters for free”. Alrighty then. I’ll keep my turbos. But how will they deal with an existing NetApp customer that’s getting something like 3x efficiency already?

If a NetApp customer is getting 3x the usable storage due to deduplication and other means, will EMC come up with the difference or will they just make sure they offer 20% more raw storage? 

To the customer, all that matters is how much effective storage they’re able to use, not how much raw storage is in the box.
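A quick back-of-the-envelope comparison in Python shows why raw capacity alone misses the point. The 10TB baseline is a made-up figure for illustration; the 20% and 3x factors are the ones discussed above, and the function names are mine.

```python
def effective_tb(usable_tb, efficiency_factor=1.0):
    """Logical data that fits, given usable space and an efficiency multiplier."""
    return usable_tb * efficiency_factor


def savings(logical_tb, physical_tb):
    """Fraction of space saved when logical_tb of data occupies physical_tb."""
    return 1 - physical_tb / logical_tb


baseline_tb = 10.0  # hypothetical usable capacity, same for both systems

twenty_pct_more = effective_tb(baseline_tb * 1.20)                  # 20% more usable space
three_x = effective_tb(baseline_tb, efficiency_factor=3.0)          # 3x from dedupe etc.

print(f"20% more usable space: {twenty_pct_more:.1f} TB effective")
print(f"3x efficiency:         {three_x:.1f} TB effective")
print(f"10TB stored in 3TB:    {savings(10, 3):.0%} saved")
```

Under these assumptions the guarantee buys 12TB of effective space against 30TB for the customer already getting 3x, which is the gap the guarantee wording does not address.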

But, still, this is not what this post is about.

Throughout the years, NetApp and other vendors have offered true innovation on different fronts. Each time that happens, EMC (that also innovates - through acquisition mostly - but likes to act as if nobody else does) employs their usual “minimize and divert” technique. Either they will trivialize the innovation (“who’d want to do that?”) or they will proclaim it false, then divert attention to something they already do (or will do in a few years).

This is even the case for technologies EMC eventually acquired, like Data Domain. Before EMC acquired Data Domain, they disparaged the product, claimed it was the worst kind of device you’d ever want in your datacenter, then tried to sell you the execrable DL3D (AKA Quantum DXi – don’t get me started, the first release was an utter mess).

We all know what happened to that story eventually… at the moment, EMC is offering to swap out existing DL3Ds for free in many cases, and put Data Domain in their place since it’s infinitely better. But wait, weren’t they saying how terrible Data Domain was compared to DL3D?

Some will say this is fine since they’re just trying to compete, and “all is fair”. Personally, if I were approached by sales teams with those about-face tactics, I’d be annoyed.

So, without further ado, I present you with a slide a colleague created. Some of the timing may be a bit off, but the gist should be fairly clear… :)

image

I could have added a few more lines (Flash Cache, for instance) but it would have made for too busy a slide.

EDIT: I’ll add something I posted as a comment on someone else’s blog that I think is germane.

Since, to provide apples-to-apples protection, EMC HAS to be configured with RAID6, where are the public benchmarks showing EMC RAID6? As you well know, ALL NetApp benchmarks (SPEC, SPC) are with RAID-DP. Any EMC benchmarks around are with RAID10.

Maybe another guarantee is needed:

Provide no worse protection, functionality, space and performance than X competitor.

Otherwise, you’re only tackling a relatively unimportant part of the big picture.

D

Fri
7
May '10

NetApp usable space – beyond the FUD

I come across all kinds of FUD, and some of the most ridiculous claims against NetApp regard usable space. I won’t post screenshots from competitive docs since who knows who’ll complain, but suffice it to say that one of the usual strategies against NetApp is to claim the system has something like well under 50% space efficiency using a variety of calculations, anecdotes and obsolete information. In one case, 34% usable space :) Right…

The purpose of this post is to outline the state of the art regarding NetApp usable space as of Spring of 2010. 

Since NetApp systems can use free space in various ways instead of just for LUNs, there is frequent confusion regarding what each space-related parameter means, and what the best practices are. NetApp’s recommendations have changed over the years as the technology matured – my goal is to bring everybody up to speed.

Executive summary

Depending on the number and type of drives and the design, aside from edge cases dealing with small systems with a very low number of disks, the real usable space in NetApp systems can easily exceed 75% of the real usable space in the drives. I’ve seen it as high as about 78%. That’s amazingly efficient for something with double-parity protection as default and includes spares. This number is the same whether it represents NAS or SAN data and doesn’t include deduplication, compression or space-efficient clones, which could inflate it to over 1000%. Indeed, NetApp systems are used in the biggest storage installations on the planet partly because they’re so space-efficient. Now, on to the details…

What’s space good for anyway?

Legacy arrays use space in very simple terms – you create RAID groups, then you create LUNs on them and those LUNs pretend they’re normal disks, and that’s that. Figuring out where your space goes is easy – there’s a 1:1 relationship between LUN size and space used on the array. You buy an array that can provide 10TB after RAID and spares, and that’s all you ever get – nothing more, nothing less.

Legacy arrays can sometimes use features such as snapshots, but frequently there are so many caveats around their use (performance being a big one) that either they’re never implemented, or their number is too small to make them really useful.

Since NetApp gear doesn’t suffer from those limitations, customers invariably end up using snapshots a lot, and for various reasons, not just backup. I have customers with over 10,000 snapshots in their arrays… they replicate all those snapshots to another array, can retrieve data that’s several months old, and have stopped relying on legacy backup software, saving money and achieving far faster and easier DR in the process, since with snapshots there’s no restore needed.

What’s your effective space with NetApp gear?

If you consider that each snapshot looks like a complete copy of your data, without factoring in any deduplication at all, the effective logical space could be many, many times more than the physical space. A large law firm I deal with manages to fit about 2.5PB of data into 8TB of snapshot delta space – which is pretty efficient by anyone’s standards. We’re not talking about backups done on deduplicated disk here that need to be restored to become useful – we’re talking about many thousands of straight-up, application-consistent, full copies of LUNs, CIFS and NFS shares that you can mount at full speed instantly, without needing to restore from another medium or backup application.

Once you add deduplication and thin cloning, the storage efficiency goes even higher.

It’s not the size of your disk that matters, it’s how you use it

If you use a NetApp system like a legacy disk array, without taking advantage of any of the advanced features (maybe you just care for the multi-protocol functionality, with great performance and reliability) then your usable space falls right within norms. Once you start using the advanced snapshot features, they start eating space of course – but giving you something in return. What you need to figure out is if the tradeoffs are worth it: for instance, if I can keep a month’s worth of Exchange backups with a nominal capacity increase, what is that worth for me? Maybe:

  • I can eliminate backup software licenses
  • I can shrink my storage footprint
  • Avoid purchasing external disk for backups
  • I don’t need to buy external CDP hardware/software and a bunch of extra disk
  • My restores take seconds
  • DR becomes trivial

Or, if I can create 150 clones of my SQL database that my developers can simultaneously use and only chew up a small fraction of the space I’d otherwise need, what is that worth? With other systems, I’d need 150x the space…

Or, create thousands of VM clones for VDI…

How much money are you saving?

What do simplicity and speed mean to your business from an OpEx savings standpoint?

Another way to look at it:

How much more efficient would your business be if you weren’t hampered by the limitations of legacy technology? It’s all about becoming aware of the expanded possibilities.

What you buy

FYI, and to clear any misconceptions in case you can’t be bothered to read the rest: if you ask me for a 10TB usable system, you’ll get a system that will truly provide 10TB usable, honest-to-goodness Base2 space protected against dual-drive failure (no RAID5), and after all overheads, spares etc. have been taken out. If you want snapshot space we’ll have to add some (like you’d need to with any other vendor). It’s as simple as that.

Right-sized, real space vs raw capacity

Others have explained some of this before but, for completeness, I’ll take a stab:

  • The real usable size of, say, a 450GB drive is not really 450GB regardless of the manufacturer.
  • The real usable capacity quoted depends on whether it’s Base2 or Base10 math and a bunch of other factors…
  • All vendors that source drives from multiple manufacturers that use RAID groups need to right-size their drives – meaning that, if manufacturer A offers a tad more space in the drive than manufacturer B, in order to use both kinds of drives in the same RAID group, you kinda need to make them seem like the exact same size, meaning you go for the lowest common denominator.
  • Using our 450GB example above, the real addressable right-sized Base10 space in that drive is 438.3GB, and even “less” in Base2 (408.2). Base2 math simply means 1024 bytes in 1K, not 1000, and the rest follows.
  • Beware of analysis, comparisons or quotes showing Base10 from one vendor and Base2 from another, or raw disk space from one vendor vs right-sized from another! Always ask what base is what you’re seeing and whether the numbers reflect right-sized drives! If you look at the right-sized drive Base2 space from various vendors, it’s usually pretty close. Base your % usable calculations on that number and not the marketing 450GB number that’s not real for any vendor anyway.
  • Everyone pretty much buys the same drives from the same drive manufacturers…
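The Base10 vs Base2 difference is plain unit conversion, sketched below in Python. The function name is mine; the only facts used are 10^9 bytes per Base10 GB and 2^30 bytes per Base2 GB.

```python
BYTES_PER_BASE10_GB = 10**9   # 1000 * 1000 * 1000
BYTES_PER_BASE2_GB = 2**30    # 1024 * 1024 * 1024


def base10_to_base2_gb(base10_gb):
    """Convert a capacity quoted in Base10 GB to Base2 GB."""
    return base10_gb * BYTES_PER_BASE10_GB / BYTES_PER_BASE2_GB


# The marketing number and the right-sized Base10 number for the same drive.
for label, gb in (("marketing", 450.0), ("right-sized", 438.3)):
    print(f"{label}: {gb} Base10 GB = {base10_to_base2_gb(gb):.1f} Base2 GB")
```

The same drive shrinks by roughly 7% just from switching bases, which is why mixing bases across vendors in one comparison is so misleading.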

Some space reservation axioms

Any system that allows snapshots, clones etc. typically needs some space for those advanced operations. For instance, if you completely fill up a system and then want to take a snapshot, it may let you but if you modify any data then it won’t have space to store the writes and the snapshot will be invalidated and deleted – kinda pointless.

As usual, there is no magic. If you expect to be able to store multiple snapshots, the system needs space to store the data changed between snapshots, regardless of array vendor!

And, out of curiosity – how many man-made devices do you own that you max out all the time? Not leaving breathing room is a recipe for trouble for any piece of equipment.

Explanation of the NetApp data organization

For the uninitiated, here’s a hierarchical list of NetApp structures:

  1. Disks
  2. RAID groups – made of multiple disks. Default RAID is RAID-DP. The system automatically makes them, you don’t need to define them or worry about back-end balancing etc. NetApp RAID groups are typically large, 16 disks or so. RAID-DP ensures better protection than RAID10 (the math shows 163x better than RAID10 and 4,000x better than RAID5).
  3. Parity drives – drives containing extra information that can be used to rebuild data. RAID-DP uses 2 parity drives per RAID group.
  4. Spares – drives that can replace failed or failing drives (no need to wait until the drive is truly dead)
  5. Aggregates – a collection of RAID groups and the basic unit from which space is allocated. That’s really what you define, then the system figures out automatically how to allocate disks and create RAID groups for you (can even expand RAID groups on the fly as you add more disks to the aggregate, even 1 disk at a time).
  6. Volumes – a container that takes space from an Aggregate. A volume can be NAS or SAN. A volume can only belong to one Aggregate, and there will typically be many volumes within an Aggregate. Most people will enable the automatic growing of Volumes.
  7. LUNs – they are placed inside the Volumes. One or more per volume, depending on what you’re trying to do. Usually one.
  8. Snapshots – logical, space-efficient copies of either entire Volumes or structures within volumes. There are 3 kinds depending on what you’re trying to do (Snapshot, SnapVault and FlexClone) but they all use similar underlying technology. I might get into the differences in a future post. Briefly: Snapshot – shorter term, SnapVault – longer term, FlexClone – writeable Snapshot.

Explanation of the NetApp space allocations

  1. Snapshot Reserve – an accounting feature that sets aside a logical percentage of space on a Volume. For instance, if you create a 10TB volume and set a 10% Snap Reserve, the client system will see 9TB usable. Most people will enable automatic deletion of Snapshots. The percentage to set aside is at your discretion and is variable on the fly. The actual amount of space consumed is related to your rate of change between snapshots. See here for some real averages across thousands of systems.
  2. Aggregate Snap Reserve – this is pretty unique. One can actually roll back an entire Aggregate on a NetApp system – can come in handy if you accidentally deleted whole Volumes or in general did some gigantic boo-boo. Rolling back the entire Aggregate will undo whatever was done to that aggregate to break it! This feature is enabled by default and has a 5% reservation. It is not mandatory unless you are running SyncMirror (mostly in MetroCluster setups). Depending on what you want to do, you could disable this altogether or set it to a small number like 1% (my recommendation).
  3. Fractional Reserve – The one that confuses everyone. In a nutshell: it’s a legacy safety net in case you want to modify all the data within a LUN yet still keep the snapshots. Think about it: Let’s say you took a snapshot and you then went ahead and modified every single block of your data. Your snap delta would balloon to the total size of the LUN – regardless of whether you use NetApp, EMC, XIV, Compellent, 3Par, HDS, HP etc etc. The data has to go someplace! There’s a great explanation in this document and I suggest you read it since it covers quite a bit more, too. This one is great, too. Long story short: With snapshot autodelete, and/or volume autogrow, you can set it to zero. If you use the SnapManager products, they take care of snapshot deletion themselves.
  4. System reserve – this is the only one that’s not optional. It’s set to 10% by default. You can actually change it but I’m not telling you how. That space is there for a reason, and changing it will potentially cause problems with high write rate environments. That 10% is used for various operations and has been found to be a good percentage to maintain good performance. All NetApp sizing takes this into account. BTW – ask other vendors if it’s perfectly safe to fill their systems at 100% all the time and whether that impacts performance or prevents them from being able to do certain things… And finally, that 10% lost is gained back in spades with the other NetApp efficiency methodologies (starting at the low level with RAID-DP – please do some simple math based on our 16+ drive RAID group vs typical RAID group sizes) so it doesn’t even matter.

Bottom line: Aside from the 10% system reserve, the rest is all usable space.
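Here is a rough Python sketch of how these reservations stack up. It is a simplification and not NetApp's sizing method: it assumes the reserves multiply together, looks at a single RAID-DP group, and ignores spares and drive right-sizing. The function names are mine.

```python
def usable_fraction(disks_per_group, parity_per_group=2,
                    system_reserve=0.10, aggr_snap_reserve=0.05):
    """Fraction of right-sized disk space left after parity and reserves.

    Assumed model: parity comes off first, then the system reserve,
    then the aggregate snap reserve, each as a simple percentage.
    """
    data_fraction = (disks_per_group - parity_per_group) / disks_per_group
    return data_fraction * (1 - system_reserve) * (1 - aggr_snap_reserve)


def client_visible_tb(volume_tb, snap_reserve):
    """Space the client sees in a volume after the Snapshot Reserve."""
    return volume_tb * (1 - snap_reserve)


print(f"16-disk group, default 5% aggregate reserve: {usable_fraction(16):.1%}")
print(f"16-disk group, 1% aggregate reserve:         "
      f"{usable_fraction(16, aggr_snap_reserve=0.01):.1%}")
print(f"10TB volume with 10% snap reserve: {client_visible_tb(10, 0.10):.1f} TB")
```

For a 16-disk group this comes out a little under 75% with the default aggregate reserve and close to 78% with it set to 1%; since spares are left out, treat both as rough per-group figures rather than a system-wide number.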

The NetApp defaults and some advice

So, here’s where it can get interesting (and confusing) and where the competition gets all their ammunition. Depending on the age of the documentation and firmware, different best practices and defaults apply.

So, if you look at competitive docs from other vendors, they claim that if you use NetApp for LUNs you waste double the space for fractional reserve. That recommendation was true many years ago and it was a safety precaution regarding fractional reserve. The documentation was updated years ago with zero fractional reserve as the recommendation, but of course that doesn’t help competitors so they left the old messaging. So here’s a basic list of quick recommendations for LUNs:

  1. Snap reserve – 0
  2. Fractional reserve – 0
  3. Snap autodelete on (unless you have SnapManager products managing the snap deletion)
  4. Volume autogrow on
  5. Leave at least a little space available in your volumes, don’t let a LUN 100% fill a volume (the LUN space can be thick but the volume space can be thin-provisioned). This space is needed temporarily for deduplication and other processes.
  6. Do consider embracing thin provisioning, even if you don’t want to oversubscribe your disk. It’s much more flexible long-term, and allows for storage elasticity.
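As a way to keep the first four recommendations handy, here is a small Python sketch that compares a volume's settings against them. The setting names are hypothetical labels I made up for illustration; they are not Data ONTAP option names, and the checker knows nothing about a real system.

```python
# Recommended LUN volume settings from the list above (keys are hypothetical).
RECOMMENDED = {
    "snap_reserve_pct": 0,
    "fractional_reserve_pct": 0,
    "snap_autodelete": True,   # leave off if SnapManager handles snap deletion
    "volume_autogrow": True,
}


def settings_to_review(current, recommended=RECOMMENDED):
    """Return {setting: (current_value, recommended_value)} for every mismatch."""
    return {
        name: (current.get(name), wanted)
        for name, wanted in recommended.items()
        if current.get(name) != wanted
    }


# Example: an older volume still carrying a 100% fractional reserve.
old_volume = {
    "snap_reserve_pct": 20,
    "fractional_reserve_pct": 100,
    "snap_autodelete": False,
    "volume_autogrow": True,
}

for name, (have, want) in settings_to_review(old_volume).items():
    print(f"{name}: currently {have}, recommended {want}")
```

Anything the checker flags is a conversation starter with your engineer, not an instruction to change the value blindly.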

So, look at the defaults and ask your engineer if it’s OK to change them if they don’t agree with the settings above. Especially on older systems, I notice that the fractional reserve is still 100%, even after getting updated with the latest software (the update doesn’t change your config). Nothing like giving someone a bunch of disk space back with a few clicks…

If you want to do thin provisioning, depending on the firmware, you may see that using thin provisioning on a volume forces the fractional reserve to 100% – but, ultimately, no real space is being consumed. This was OK in 7.2x, changed to the 100% behavior in 7.3.1, and was fixed in 7.3.3 since it was confusing everyone.

The bottom line

Ultimately, I want you thinking of how you can use your storage as a resource that enables you to do more than just storing your LUNs. And, finally, I wanted to dispel notions that NetApp storage has less storage efficiency than legacy systems. Comments are always appreciated!

D

Tue
27
Apr '10

What exactly is Unified Storage and who can sell it to you?

It’s come to my attention that pretty much every storage manufacturer is trying to imitate NetApp’s thought leadership and keeps announcing “Unified Storage” products. Everyone can do it now, it seems :)

Now, this post is not going to be bashing them or claiming they don’t work.

This post is about arguing what “Unified Storage” really means. And, more importantly, whether you should care about the differences.

NetApp has been shipping Unified Storage for 8+ years now, and has shipped 150,000 Unified Storage systems to date. See here and here. So, I’d think nobody can argue that NetApp has quite a bit of experience in the technology and, indeed, was the very first to do it. Depending on your definition of “Unified”, NetApp may still be the only one doing it, but read on.

The crazy success of NetApp’s Unified Storage (just look at the company’s growth) has forced the other vendors, who initially dismissed the concept, to take a harder look – imagine that, customers actually like the idea of a Unified Storage System!

Here’s how most (if not all) other vendors approach “Unified Storage”:

  • Start with your legacy Fibre Channel Array, use that to serve FC and maybe iSCSI – it’s probably a decent box, no reason to re-invent the wheel.
  • Connect some kind of Windows, Linux or UNIX server(s) to it that will then serve CIFS and NFS and maybe iSCSI (this is the NAS part)
  • Replicate them using different mechanisms for the FC and NAS parts

Pretty simple, really. You end up with the base legacy array, plus more boxes on top (ideally 2+ to ensure redundancy, plus some of them need an extra box or two called a “Control Station”).

It all works – after all, it’s just like putting servers in front of your storage, you’re doing that anyway. You are able to serve FC, iSCSI, NFS and CIFS out of the same rack, if we assume that the rack is the termination point for the cables and that you don’t care much about exactly what happens within. So, most C-level execs are OK with it – the rack can serve out all those protocols, ergo the “Unified Storage” claim seems justified.

Here are some potentially business-impacting issues with this approach:

  1. Aside from a couple of exceptions, the add-on boxes used by the storage vendors to add the NAS protocols aren’t made by that vendor (neither the OS nor the hardware). Obviously that raises some concerns with interoperability, manageability and the longevity of whatever NAS vendor was chosen. Support is now maybe not as robust since you are relying on using tech someone licensed from someone else.
  2. Replication gets complicated since you need to do it a few different ways depending on what protocol you’re replicating.
  3. Patching is more time-consuming since, apart from the legacy array, you need to also patch all the NAS paraphernalia.
  4. Management is frequently totally separate and laborious – you take care of the legacy array separately from the NAS part
  5. Certain important features are only available to one part of the solution (file-level single-instancing/“dedupe”, for example, only available for CIFS and NFS and not for iSCSI or FC).
  6. And, finally, what I think is the biggest problem: Space allocation is split between the FC and NAS parts and you can’t reduce one to increase the other. For instance, if you started with a 50/50 split, once you’ve allocated the space to the NAS (that always has its own Volume Manager and now owns that 50% chunk of array space), and you realize you’re only using 10% of that space after all, you can’t go ahead and return the remainder of the space to the FC part. This can cause serious inefficiency, inflexibility, cost and manageability issues.
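The 50/50 split in the last item is easy to put numbers on. The Python sketch below uses a made-up 100TB array purely for illustration; the split and the 10% utilization are the figures from the example above, and the function name is mine.

```python
def stranded_tb(total_tb, nas_share, nas_used_fraction):
    """Space allocated to the NAS side that sits unused and cannot be returned."""
    nas_allocated = total_tb * nas_share
    return nas_allocated * (1 - nas_used_fraction)


total_tb = 100.0  # hypothetical array size
stranded = stranded_tb(total_tb, nas_share=0.5, nas_used_fraction=0.10)

print(f"Stranded on the NAS side: {stranded:.1f} TB "
      f"({stranded / total_tb:.0%} of the array)")
```

With a hard split, about 45TB of that hypothetical array is locked away from the FC side even though nobody is using it; with a single shared pool the same space would simply remain free.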

The NetApp approach

NetApp decided to do things a bit differently. Maybe by virtue of how the original systems started out, it turned out to be easier for NetApp to create what is effectively a protocol engine. Maybe “Protocol Engine with Integrated Disk Control and Protection” is more appropriate than “Unified Storage” but it’s a bit wordy…

Effectively, a single NetApp box, without external hangers-on, allows you to:

  • Connect using a variety of methods – FC, 1GbE, 10GbE, FCoE
  • Use the proprietary NetApp RAID-DP protection for great performance and better protection than RAID10
  • Provision FC, iSCSI, CIFS and NFS out of the same pool of physical disk space
  • Deduplicate FC, iSCSI, CIFS and NFS workloads
  • Perform application-aware replication regardless of protocol
  • Take application-aware snapshots regardless of protocol
  • Clone VMs, DBs and indeed, anything you like, without chewing up space and without impacting performance
  • Virtualize legacy arrays and impart on them the NetApp features
  • Perform workload and cache prioritization
  • Auto-migrate hot blocks to large flash cache to increase speeds (at a super-efficient 4K granularity)

As you can see, everything happens within one system, there’s no separate RAID controller or NAS box or replication box. And, like it or not, that’s a pretty impressive list of capabilities.

The potential business benefits with a true Unified Storage system

  1. Single product – you’re not relying on the marriage of completely different boxes.
  2. Better reliability, fewer things to break.
  3. Better support – no finger-pointing, it’s a single system from a single company.
  4. Consistent replication – one way to replicate things, yet still application-aware for 100% recoverability, improved CapEx and OpEx.
  5. Management simplicity – lower OpEx.
  6. All performance-enhancing and efficiency features are available to all protocols – Improved CapEx.
  7. There’s no dichotomy between FC, iSCSI and NAS space – allocations are fluid – Improved CapEx and OpEx.
  8. Protect your existing investment by virtualizing existing legacy disk arrays – improved CapEx and OpEx.
  9. Overall lower OpEx and CapEx – in addition to the significant space-saving features (avoid purchasing as much storage long-term), there’s significant cost avoidance since you potentially don’t need to purchase: Backup software, deduplication appliances, replication appliances, fileservers, OS licenses…

So, should you care how “Unified Storage” is architected?

Beyond the philosophical debate (one box vs multiple), given what you read, what do you think? I believe that the multi-box approach has some inherent drawbacks that are difficult to overcome. Comments welcome as always.

D

Thu
18
Mar '10

FUD and The Invention of Lying

I watched “The Invention of Lying” movie the other day. Fairly entertaining, and it had an interesting concept:

Imagine a society where nobody can lie – the very concept of lying is alien and never even enters anyone’s mind. Obviously, tons of jokes can be made using that premise, and the movie is riddled with them – such as their fictional Pepsi ad: “Pepsi: when they’re out of Coke!”

In the movie, a single man stumbles upon the concept of lying, and realizes he can do whatever he wishes since nobody else can tell he’s lying.

Obviously, in our society lying is quite prevalent – a large percentage of the population wouldn’t have jobs or offspring without lying.

I thought – what if, just for fun, we applied “The Invention of Lying” movie concept to IT sales? (I guess this is another take on comparing vendors to cars or wines and whatnot). I’m going for an alphabetical, non-comprehensive list (and added a few non-storage entries). I’ll leave it to the reader to figure out if this is more accurate from the standpoint of a rep that cannot lie, or vice versa… :)

  • 3Par: Our best asset is Marc Farley, his highly entertaining blog is what sells our gear. Our gear is pretty fast, though the software is not as good as others’. Unsure how we are still in business. Also unsure why nobody has bought us yet. We do have a handful of very large, loyal customers.
  • Apple: Our stuff is prettier but inside it’s all the same, actually often slower than others. Oh, and it’s a lot more expensive. But the software is cool (when you can find it). You’ll probably need to run Windows in a VM anyway to get the full functionality. Did we mention our stuff is prettier?
  • Bluearc: We have limited-functionality NAS with good sequential and random read speeds but not so much for random writes. Oh, and no application integration. But it’s good for certain workloads. Why is nobody acquiring us?
  • Compellent: Data Progression is the coolest thing we do, and we’ll probably go under now that the big vendors can do it. Oh, and it never did much in the real world, especially for performance. Hopefully we’ll get acquired, but if our technology is that good, why did nobody acquire us yet? We’re extremely affordable!
  • Equallogic: We’ll give you free storage (the first hit is free) if/since you also buy Dell servers. We might even throw in a free laptop and a projector. And a mouse pad. Make sure you convert everything to iSCSI since that’s all we do. Oh, you wanted to know specifics about the storage? Well – it’s free! If you buy some servers. You really want to know about the storage? Well, it’s free if… What? You want to understand the failure math of RAID 50? It’s atrocious, but the box is free if…
  • EMC: We buy companies since innovating is kinda hard and time-consuming, so our solutions end up being a mish-mash of technologies. It all mostly works, though interoperability between platforms sucks. Regarding storage, you should really only buy Symmetrix since all our other stuff doesn’t even come close to that quality, we have the other boxes just to meet price points and plug portfolio holes. We trash competitors until we acquire them or until we build something good enough that’s similar. We also sell futures. Hard. We focus too much on NetApp.
  • HDS: We don’t know how to write software but our high-end gear hardware is pretty solid. The cheaper stuff is OK, severely lacks in functionality but we’ll just drop the price enough that you’ll buy it anyway. Capisce?
  • HP: Seems that buying companies works for EMC, we’ll do the same, let’s see what happens. We used to make the best calculators in the world. Oh, and our best array is actually made by HDS. Our servers are great! Please, also buy some printers, they’re pretty good.
  • IBM: We used to be some of the best in storage, now our only 2 products are SVC and DS8K (oops, and now XIV), everything else we resell after we put our faceplates on it. Our biggest sellers are products made by LSI and NetApp. Oh, and we internally compete with the XIV team we acquired. Our storage solutions don’t talk to one another since they’re all made by different people. But SVC can tie it all together! Well, some of it, anyway.
  • Intel: We are so big that even if AMD has better stuff, eventually we catch up. Just you wait. In the meantime, buy more Intel to keep us going. Resistance is futile.
  • Isilon: We are decent for bulk sequential-access NAS, just don’t do any kind of random workload on our gear.
  • LeftHand: If you want any reasonable storage efficiency plus resiliency you need to buy a bunch of boxes (5 or so), since each box is essentially an HP server with internal disks, and the whole server can die. Oh, and we only do iSCSI. So you better make sure you only do iSCSI.
  • NetApp: We probably have some of the worst marketing of all vendors, and often can’t clearly articulate what makes our systems better to C-level execs, focusing almost entirely on techies. We also have issues with making some acquisitions pan out. ONTAP 8 is taking us forever to release, and until then you won’t have very wide striping (update: GA’d 3/19/10). We complicate sales because our engineers are too technical and insist on explaining how the boxes work at a low level, frequently confusing customers, who seldom care about understanding Row-Diagonal Parity equations. Too much good information is tribal knowledge, including performance tuning and the gigantic customers we have. We focus too much on EMC.
  • Pillar: We cry ourselves to sleep because all we have is Larry Ellison and QoS. Maybe Larry will finally force Oracle to buy some of his^H^H^H our gear? I wonder how that will go down since Oracle is already using a superior technology and achieving great savings… but we do make a fairly fast box if you’re OK with limited functionality and RAID50.
  • Sun: We can sell you some LSI storage, but even that may be going away. You can also get the exact same storage from IBM that also resells LSI. How about a Thumper? We may also have some leftover HDS gear that we can give you real cheap.
  • Xiotech: Our value prop is extremely obscure and only understood well by about 5 engineers. Out of those 5 engineers, 2 understand the exact failure scenarios of our ISE architecture, and they can’t explain it to anyone else. We are pretty cheap though.
  • XIV: We believe in success through obfuscation. Our box can only do about 17K IOPS if the workload isn’t cache-friendly but we know how to cheat in benchmarks and make it seem faster (make sure your benchmark writes all zeros and/or fits in cache). The box also consumes more power and space than any other storage system. Our reps compete with IBM reps even though we are owned by IBM, since we only get paid on XIV sales, regardless of what the customer’s needs are. Oh, and under certain conditions, a 2-disk failure will bring down the entire system. But don’t you worry about that. BTW, the GUI is amazingly pretty.

Hope you had a chuckle reading some of this!

(minor edits – typo plus some on Twitter complained I was too gentle in the NetApp section :) )

D

Thu
18
Feb '10

So, are there any independent bloggers? Really?

There was some weird backlash against my site and my person recently – see here and here and in the comments here. Chuck Hollis got all uppity about whether I work at NetApp (with, for) or not.

I find it interesting that this only came up when I wrote something pro-NetApp. Wasn’t even anti-EMC.

It never came up when I was extolling the virtues of RecoverPoint (which I still think is awesome). I didn’t see anyone from NetApp or any EMC competitor start questioning where I worked, where the full disclosure was etc etc. Maybe they all just assumed I worked for EMC. Well – not directly, I was selling a ton of EMC gear, which was in turn paying my mortgage, which is as good as. But, ultimately, I just like the product since, properly deployed, it can solve some real problems.

So why is NetApp the company everyone loves to hate? Is it fear? Disrespect? Lack of understanding? All the above? But, I digress. NetApp customers love the product, and the company’s recent earnings announcement, as well as the fact we sold 1 Exabyte of enterprise storage last year, tells the real story. The People want their highly-functional, space-efficient, simple-to-use, application-aware storage, not 50 different products that are loosely integrated. Volkslagerung! Is that right, German-speaking readers? (edit: Volksdatenspeicher seems better as “storage for the people”).

So, I clarified things in the About page (upper left), I thought it was already clear but apparently not. Chuck is still not satisfied, so I think I’ll have to figure out a way to show some fancy animation of me in some NetApp uniform, hugging Hitz, Lau, Georgens and Mendoza and receiving my MVP award. Plus another animation showing the super-secret initiation ceremony and the extensive branding on my left buttock. Right.

What was most interesting in this ad hominem attack was that the important discussion topics were largely ignored, a very efficient tactic to lure the unsuspecting reader’s mind away from the real issues.

Which brings us to the subject of this post.

There seems to be this cute, romantic notion that there is such a thing as a truly independent blogger, and if I’m not independent, then what I say is tainted.

Well – let me break it to you and disabuse you of this notion: There ain’t no such thing as an independent blogger.

We are all biased, one way or another, about everything. Our past experiences shape our biases and the automatic stories our brains will create to explain any information we are presented with.

It doesn’t matter whether we work for a storage vendor or are customers – indeed, customers are typically among the most biased IT folks around! (storage vendor employees are usually crusty, jaded, cynical, have been around the block and typically have the dirt on multiple technologies).

I’ve been in customer meetings where I was told the customer doesn’t ever want to talk to EMC again because they treated him badly 10 years ago, or that he doesn’t want to talk to NetApp because he read in Barry’s blog that it only has 30% usable space, another that has FC queuing issues with HDS gear and wants to get rid of it at all costs, yet another that has had some controller panics with IBM gear and wants to get off of that and never touch IBM ever again, the list goes on. Those guys become zealots.

Then you have the other customer type, the one that receives Rolexes and other cool gifts in order to say whatever he’s told to say. Some actually will demand it (I’ve been in one of those meetings, too – “if you give me your watch we may have a deal”. I chose to assume he was kidding, lest I completely lose my faith in mankind).

You then have your “analyst” type that’s an independent industry “expert” – most of those guys haven’t touched the products they’re writing about, ever, and are just rehashing whatever they read in other publications or are told by their vendor drinking buddy. Yet they’re among the most trusted and read. They, too, have their personal favorite horses they’re backing…

Finally you have your VAR bloggers. People – those guys make money selling the stuff. Yes, they know the tech, but don’t exactly expect an impartial discussion… plus, they get all kinds of incentives from vendors.

So, who do you trust, when you can’t even trust yourself? Since, by definition, you are also biased, gentle reader…

I wish I could tell you. Ultimately, everyone has an agenda, whether conscious or subconscious. You just need to become shrewd enough to see through the agenda.

Maybe a good starting point is a truly intelligent, fact-based discussion bereft of ad hominem attacks?

D

Wed
10
Feb '10

More FUD busting: Deduplication – is variable-block better than fixed-block, and should you care?

Before all the variable-block aficionados go up in arms, I freely admit variable-block deduplication may overall squeeze more dedupe out of your data.

I won’t go into a laborious explanation of variable vs fixed, but, in a nutshell, fixed-block deduplication means that data is split into equal chunks, each chunk given a signature, compared to a DB and the common chunks are not stored.
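For the curious, the fixed-block idea can be sketched in a few lines of Python. This is only an illustration of the general technique described above, not NetApp's (or anyone's) actual implementation: the 4096-byte block size, the use of SHA-256 as the signature, and the in-memory dictionary standing in for the signature DB are all assumptions of mine.

```python
import hashlib


def dedupe_fixed(data: bytes, block_size: int = 4096):
    """Split data into equal blocks and store each unique block only once."""
    store = {}    # signature -> block contents (stands in for the signature DB)
    recipe = []   # ordered list of signatures needed to rebuild the data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        signature = hashlib.sha256(block).hexdigest()
        if signature not in store:   # only previously unseen blocks consume space
            store[signature] = block
        recipe.append(signature)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[signature] for signature in recipe)


# Five blocks are written, but only two distinct block contents exist.
data = b"A" * 4096 * 3 + b"B" * 4096 + b"A" * 4096
store, recipe = dedupe_fixed(data)
print(len(recipe), "blocks written,", len(store), "unique blocks stored")
```

The recipe keeps enough information to read the data back unchanged, while the store only grows when a block with a new signature shows up.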

Variable-block basically means the chunk size is variable, with more intelligent algorithms also having a sliding window, so that even if the content in a file is shifted, the commonality will still be discovered.
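Here is a toy comparison of the two chunking styles when content gets shifted by a single byte. It is a sketch under loud assumptions: the rolling sum over a 16-byte window is a deliberately simple stand-in for the stronger rolling hashes real products use, and the window size, divisor, 64-byte fixed block and random test data are all made up for illustration.

```python
import random


def chunk_fixed(data: bytes, size: int = 64):
    """Fixed-block: cut every `size` bytes regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_variable(data: bytes, window: int = 16, divisor: int = 64):
    """Variable-block: cut wherever the sum of the last `window` bytes hits a target."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]   # slide the window forward by one byte
        # The boundary decision depends only on nearby content, not on the offset.
        if i + 1 >= window and rolling % divisor == divisor - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])       # whatever is left after the last boundary
    return chunks


rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(20000))
shifted = b"X" + original                 # same content, pushed along by one byte


def shared(chunker):
    """Count distinct chunks that both versions have in common."""
    return len(set(chunker(original)) & set(chunker(shifted)))


print("fixed-block chunks in common:   ", shared(chunk_fixed))
print("variable-block chunks in common:", shared(chunk_variable))
```

Because the variable chunker picks its boundaries from the content itself, the boundaries move along with the data and most chunks still match after the shift. The fixed chunker's boundaries stay at the same offsets, so every block ends up with different contents.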

With that out of the way, let’s get to the FUD part of the post.

I recently had a TLA vendor tell my customer: “NetApp deduplication is fixed-block vs our variable-block, therefore far less efficient, therefore you must be sick in the head to consider buying that stuff for primary storage!”

This is a very good example of FUD that is based on accurate facts which, in addition, focuses the customer’s mind on the tech nitty-gritty and away from the big picture (that being “primary storage” in this case).

Using the argument for a pure backup solution is actually valid. But what if the customer is not just shopping for a backup solution? Or, what if, for the same money, they could have it all?

My question is: Why do we use deduplication?

At the most basic level, deduplication will reduce the amount of data stored on a medium, enabling you to buy less of said medium yet still store quite a bit of data.

So, backups were the most obvious place to deploy deduplication. Backup-to-Disk is all the rage, what if you can store more backups on target disk with less gear? That’s pretty compelling. In that space you have of course Data Domain and the Quantum DXi as two of the more usual backup target suspects.

Another reason to deduplicate is to not only achieve more storage efficiency but also improve backup times by not even transferring over the network data that’s already been transferred. In that space there’s Avamar, PureDisk, Asigra, Evault and others.

NetApp simply came up with a few more reasons to deduplicate, not mutually exclusive with the other 2 use cases above:

  1. What if you could deduplicate your primary storage – typically the most expensive part of any storage investment – and as a result buy less?
  2. What if deduplication could actually dramatically improve your performance in some cases, while not hindering it in most cases? (the cache is deduplicated as well, more info later).
  3. What if deduplication was not limited to infrequently-accessed data but, instead, could be used for high-performance access?

For the uninitiated, NetApp is the only vendor, to date, that can offer block-level deduplication for all primary storage protocols for production data - block and file, FC, iSCSI, CIFS, NFS.

Which is a pretty big deal, as is anything useful AND exclusive.

What the FUD carefully fails to mention is that:

  1. Deduplication is free to all NetApp customers (whoever didn’t have it before can get it via a firmware upgrade for free)
  2. NetApp customers that use this free technology see primary storage savings that I’ve seen range anywhere from 10% to 95%, despite all the limitations the FUD-slingers keep mentioning
  3. It works amazingly well with virtualization and actually greatly speeds things up especially for VDI
  4. Things that would defeat NetApp dedupe will also defeat the other vendors’ dedupe (movies, compressed images, large DBs with a lot of block shuffling). There is no magic.

So, if a customer is considering a new primary storage system, like it or not, NetApp is the only game in town with deduplication across all storage protocols.

Which brings us back to whether fixed-block is less efficient than variable-block:

WHO CARES? If, even with whatever limitations it may have, NetApp dedupe can reduce your primary storage footprint by any decent percentage, you’re already ahead! Heck, even 20% savings can mean a lot of money in a large primary storage system!

Not bad for a technology given away with every NetApp system.

D

Mon
8
Feb '10

NetApp disk rebuild impact on performance (or lack thereof)

Due to the craziness in the previous blog, I decided to post an actual graph showing a NetApp system’s I/O latency while under load and a disk rebuild. It was a bakeoff vs another large storage vendor (which NetApp won).

The test was done at a large media company with over 70,000 Exchange seats. It was with no more than 84 drives, so we’re not talking about some gigantic lab queen system (I love Farley’s term). The box was set up per best practices, with aggregate size being 28 disks in this case.

(Edited at the request of EMC’s CTO to include the performance tidbit): Over 4K IOPS were hitting each aggregate (much more than the customer needed) and the system had quite a lot of steam left in it.

There were several Exchange clusters hitting the box in parallel.

All of the testing for both vendors was conducted by Microsoft personnel for the customer.  The volume names have been removed from the graph to protect the identity of the customer:

[Graph: read and write latency over time, before, during and after the disk rebuild]

Under a 53:47 read/write ratio 8K-size IOPS, a single disk was pulled.  Pretty realistic failure scenario, a disk breaks while the system is under production-level load. Plenty of writes, too, almost 50%.

Ok.  The fuzzy line around 6ms is the read latency.  At point 1 a disk was pulled and at point 2 the rebuild completed.  Read latency increased to 8ms during the rebuild, but dropped back down to 5 after the rebuild completed.  The line at less than 1 ms response time straight across the bottom is the write latency. Yes it’s that good.

So - there was a tiny bit of performance degradation for the reads but I wouldn’t say that it “killed” performance as a competitor alleged.
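For anyone who wants the numbers spelled out, here is a small sketch using only the figures quoted in this post. Two assumptions are mine: “over 4K IOPS” is treated as a floor of exactly 4,000, and the load is divided evenly across all 28 disks in the aggregate, parity drives included.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100.0


read_before_ms = 6.0   # read latency before the disk was pulled
read_during_ms = 8.0   # read latency while the rebuild was running

added_ms = read_during_ms - read_before_ms
relative = percent_change(read_before_ms, read_during_ms)

aggregate_iops = 4000  # assumed floor, the post says "over 4K IOPS" per aggregate
aggregate_disks = 28   # all disks in the aggregate, parity included (assumption)
iops_per_disk = aggregate_iops / aggregate_disks

print(f"Read latency during rebuild: +{added_ms:.0f} ms ({relative:.0f}% higher)")
print(f"At least {iops_per_disk:.0f} IOPS per disk in the aggregate")
```

So the rebuild added roughly 2 ms to reads while each aggregate kept serving at least 4,000 IOPS.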

The rebuild time is a tad faster than 30 hours as well (look at the graph :) ) but then again the box used faster, 15K drives (and smaller, 300GB vs 500GB), so before anyone complains, it’s not apples-to-apples compared to the Demartek report.

I just wanted to illustrate a real example from a real test at a real customer using a real application, and show the real effects of drive failures in a properly-implemented RAID-DP system.

The FUD-busting will continue, stay tuned…

D

Tue
2
Feb '10

Vendor FUD-slinging – at what point should legal action be taken? And who do you believe as a customer?

I’m all for a good fight, but in the storage industry it seems that all too many creative liberties are taken when competing.

Let’s assume, for a moment, that we’re talking about the car industry instead. I like cars, and I love car analogies. So we’ll use that, and it illustrates the absurdity really well.

The competitors in this example will be BMW and Mercedes. Nobody would argue that they are two of the most prominent names in luxury cars today.

BMW has the high-performance M-series. Let’s take as an example the M6 – a 500HP performance coupe. Looks nice on paper, right?

Let’s say that Mercedes has this hypothetical new marketing campaign to discredit BMW, with the following claims (I need to, again, clarify that this campaign is entirely fictitious, and used only to illustrate my point, lest I get attacked by their lawyers):

  1. Claim the M6 doesn’t really have 500HP, but more like 200HP.
  2. Claim the M6 only does 0-60 in under 5 seconds with only 5% of the gas tank filled, a 50lb driver, downhill, with a tail wind and help from nitrous.
  3. Claim that if you fill the gas tank past 50%, performance will drop so the M6 does 0-60 in more like 30 seconds. Downhill.
  4. Claim that it breaks like clockwork past 5K miles.
  5. Claim that they have one, they tested it, and it performs as they say.
  6. Claim that, since they are Mercedes, the top name in the automotive industry, you should trust them implicitly.

Imagine Mercedes, at all levels, going to market with this kind of information – official company announcements, messages from the CEO, company blogs, engineers, sales reps, dealer reps and mechanics…

Now, imagine BMW’s reaction.

How quickly do you think they’d start suing Mercedes?

How quickly would they have 10 independent authorities testing 10 different M6 cars, full of gas, in uphill courses, with overweight drivers, just to illustrate how absurd Mercedes’ claims are?

How quickly would Mercedes issue a retraction?

And, to the petrolheads among us – wouldn’t such a stunt look like Mercedes is really, really afraid of the M6? And don’t we all know better?

More to the point – do you ever see Mercedes pulling such a stunt?

Ah, but you can get away with stuff like that in the storage industry!

Unfortunately, the storage industry is rife with vendors claiming all kinds of stuff about each other. Some of it is or was true, much of it is blown all out of proportion, and some is blatant fabrication.

For instance, XIV breaking if you pull 2 disks out – as I state in a previous post, it’s possible if the right 2 drives fail within a few minutes of each other. I think it’s unacceptable, even though it’s highly unlikely to happen in real life. But I’ve seen sales campaigns against the XIV use this as the mantra, to the point that the fallacy is finally stated: “ANY 2 drive failure will bring down the system”.

Obviously this is not true and IBM can demonstrate how untrue that is. Still, it may slow down the IBM campaign.

Other fallacies are far more complicated to prove wrong, unfortunately.

An example: Pillar Data has an asinine yet highly detailed report by Demartek showing NetApp and EMC arrays having significantly lower rebuild speeds than Pillar (as if that’s the most important piece of data management, but anyway – rebuild speed hasn’t helped Pillar sales much, even if it’s true).

Anyone who knows how to configure NetApp and EMC would see that the Pillar box was correctly configured, whereas the others were intentionally made to look 4x worse (in the case of NetApp, they literally went against not just best practices but blatantly against system defaults in order to make it slower). However, some CIOs might read this and give credence to it, since they don’t know the details and don’t read past the first graph.

For EMC and NetApp to dispute this, they have to go to the trouble of configuring, properly, a similar system, and running similar tests, then writing a detailed and coherent response. It’s like wounding the enemy soldier instead of killing them, their squadmates have to help them out, wasting manpower. I get it – it’s effective in war. But is it legal in the business world?

Last but not least: EMC and HP, at the very least, have anti-NetApp reports, blogs, PPTs etc. that literally look just like the absurd Mercedes/BMW example above, sometimes worse. Some of it was true a long time ago (the famous FUD “2x + snap delta” space requirement for LUNs is really “1x + snap delta” and has been for years), some of it is pure fabrication (“it slows down to 5% of its original speed if you fill it up!”). See here for a good explanation.
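To show what the difference between “2x + snap delta” and “1x + snap delta” means in practice, here is a minimal sketch. The 1,000 GB LUN and the 100 GB of snapshot changes are hypothetical numbers picked only to make the arithmetic easy, and `lun_space_needed` is an illustrative helper, not a real sizing tool.

```python
def lun_space_needed(lun_gb, snap_delta_gb, multiplier):
    """Space to set aside for a LUN: multiplier x LUN size plus snapshot changes."""
    return multiplier * lun_gb + snap_delta_gb


lun_gb = 1000          # hypothetical LUN size
snap_delta_gb = 100    # hypothetical amount of changed data held by snapshots

fud_claim = lun_space_needed(lun_gb, snap_delta_gb, multiplier=2)
actual = lun_space_needed(lun_gb, snap_delta_gb, multiplier=1)

print(f'"2x + snap delta": {fud_claim} GB')
print(f'"1x + snap delta": {actual} GB')
print(f"Overstated by:     {fud_claim - actual} GB")
```

With these made-up numbers, the old claim overstates the requirement by a full extra copy of the LUN.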

Of course, again that’s like wounding the enemy soldiers: NetApp engineers have to go and defend their honor, show all kinds of reports, customer examples, etc etc. Even so, at some point many CIOs will just say “I trust EMC/HP, I’ve been buying their stuff forever, I’ll just keep buying it, it works”. The FUD is enough to make many people that were just about to consider something else, go running back to mama HP.

Should NetApp sue? I’ve seen some of the FUD examples and literally they are not just a bit wrong but magnificently, spectacularly, outrageously wrong. Is that slander? Tortious interference? Simply a mistake? I’m sure some lawyer, somewhere, knows the answer. Maybe that lawyer needs to talk to some engineers and marketing people.

Let’s flip the tables:

If NetApp went ahead and simply claimed an EMC CX4-960 can only really hold 450TB, what would EMC do?

I can only imagine the insanity that would ensue.

I’ll finish with something simple from the customer standpoint:

NetApp sold 1 Exabyte of enterprise storage last year. If it was as bad as the other (obviously worried) vendors are saying, does that mean all those customers buying it by the truckload and getting all those efficiencies and performance are stupid and wasted their money?

D