When is a backup really a backup?

Chuck Hollis raised an interesting question last year: when is a backup really a backup? His argument is that a copy of data stored alongside the production copy isn’t really a backup. That means a snapshot of a document or repository kept on the same storage isn’t really a backup either. You must move a copy of your data to another location. “Another location” is ambiguous, but anything away from where you generated the content will suffice as the most basic form of a backup.

I would add a second question: when is a backup complete? When the backup job has finished moving data from the client? When that data has been stored on the media? When the backup software’s metadata or catalog has been updated? Or, as I believe, when the data is ready to be restored? Logically, this makes the most sense: only when every stage of the backup job has finished can you actually recover that data.
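To make that definition concrete, here is a minimal sketch in Python of a copy job that only reports success once the copy has been read back and matches the original. The function names (`sha256_of`, `backup_is_complete`, `back_up`) are my own illustrations, not part of any backup product, and the sketch assumes a single file and a destination directory that you have already placed on separate storage.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_complete(source: Path, backup_copy: Path) -> bool:
    """Hypothetical completeness test: the copy exists and reads back identically."""
    return backup_copy.is_file() and sha256_of(backup_copy) == sha256_of(source)


def back_up(source: Path, destination_dir: Path) -> Path:
    """Copy `source` into `destination_dir`; succeed only after a read-back check.

    Assumption: `destination_dir` lives on different storage from `source`.
    This sketch does not check that for you.
    """
    destination_dir.mkdir(parents=True, exist_ok=True)
    backup_copy = destination_dir / source.name
    shutil.copy2(source, backup_copy)  # data moved and stored
    if not backup_is_complete(source, backup_copy):  # data proven readable
        raise OSError(f"backup of {source} could not be verified")
    return backup_copy
```

A real backup product would also update its catalog and, ideally, test an actual restore. The point of the sketch is only that “the copy finished” and “the copy can be read back intact” are separate checks, and the job is not done until the second one passes.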

So, when is a backup a backup? When it is not stored with the original copy. And when is a backup complete? When that data is ready to be restored.

Enjoy Chuck’s original post below.


When Is A Backup Really A Backup?

I must be getting old and crotchety.

I keep seeing various marketing spins around “local backups” and I sort of shake my head and wonder if the world is going mad, or maybe it’s just me?

See if you agree with my line of reasoning.

To Start With

Let’s say I’m working on an important document. To protect my work, I make a copy on my local storage from time to time.

This protects me from, say, my application scrambling my data, or me doing something damaging to my document. That’s good.

But it doesn’t protect me from a desktop hardware or software failure. Nor does it protect me if my desktop goes missing. It really doesn’t protect me from myself — I could just as easily inadvertently delete my “backup” copy as I could my primary copy.

I get some protection — sort of.

Better protection occurs when I put a copy of what I’m working on in a separate place. Maybe I make a copy on the file server, or I email it to myself — something that locates the data outside the confines of my desktop environment.

I get much better protection. Sure, there are scenarios where both the primary and the backup copy run into problems, but they’re far less likely to happen. And, if that worries me, I make more copies in different places — just to lower the risk to acceptable levels.
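A rough back-of-the-envelope sketch of why extra copies in different places help, in Python. It is an editor’s illustration, not part of the original post, and the function name and the 5% figure are hypothetical. It assumes each location fails independently of the others, which is exactly the assumption that does not hold when the “backup” sits on the same device as the original.

```python
import math


def chance_all_copies_lost(loss_probabilities):
    """Probability that every copy is lost, ASSUMING independent failures.

    `loss_probabilities` holds one hypothetical per-location loss
    probability (between 0 and 1) for each place a copy is kept.
    """
    probabilities = list(loss_probabilities)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    # Independent events: multiply the individual probabilities.
    return math.prod(probabilities)


if __name__ == "__main__":
    # Illustrative numbers only: a 5% chance of losing any one location.
    print(chance_all_copies_lost([0.05]))        # one copy
    print(chance_all_copies_lost([0.05, 0.05]))  # two independent locations
```

Under these made-up numbers, a second independent location cuts the chance of losing everything from 5% to roughly 0.25%. Two copies on the same device do not get that benefit, because one failure can take out both.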

Are Local Snaps Really A Backup?

I am aware of at least three vendors in the industry promoting the idea that local snaps (i.e. quick logical copies made on the same physical storage device) are attractive from a backup perspective.

They point to the speed and the convenience of doing so as their primary argument.

I can’t quibble with that. Making a local copy of a desktop file is fast and easy. But is it a backup?

I suppose that if losing your recovery copy would be more of an inconvenience than a crisis, then, yes, I could accept that. But there’s no nuance to their message about where a local snap makes sense and where it doesn’t. It’s one of those dangerous one-size-fits-all marketing messages that drive so many of us nuts.

Storage arrays can fail — both hardware and software. Storage administrators can fail as well — inadvertently deleting important snap copies that you were counting on as protection.

Stuff happens — that’s the way of the world. And — to my way of thinking — backups are supposed to protect you from stuff happening.

To Be Fair

EMC supports all manner of local snap and replication technology on our storage arrays. Indeed, I think TimeFinder (1995) was one of the first array-based local copying mechanisms in the industry. Local copies can do simply wonderful things in providing fast recoveries of previous data states in a variety of scenarios. They’re an important part of the storage toolbag these days — I can’t imagine life without them.

But I wouldn’t dare to call those local copies “backups” — as long as they were residing on the same physical device. Or, at least, without being incredibly precise as to exactly what sort of protection was being afforded, and — more importantly — what wasn’t.

And I’m sure that there are lots of admins out there who consider their local copies their “backup”, and will swear up-and-down that they’ve never had a problem — so far. That sort of argument makes me despair, e.g. “I’ve gotten away with doing this really risky stuff just because it’s convenient, and it hasn’t caught up with me yet”.

Hey, people are trusting you with their data, you know? BTW, if that’s what you’re doing, please let me know so I can make another copy somewhere else. I don’t want your bad day to become my bad day.

So, Where Do We Go From Here?

First, I think users of these technologies have to define their own terms around what they consider a real backup, and what doesn’t qualify. Don’t let vendors do that for you — trust me, some of their motivations might not be in your best interests.

Second, past performance is no guarantee of future results. That disclosure shows up when considering financial investments; the same sort of disclosure ought to apply to data protection practices. We should avoid the natural tendency to be lulled into complacency just because it’s been smooth sailing so far.

Third, maybe we need more precision around how we use terms. I’ve never used the terms “local copy” and “backup” interchangeably, but that’s exactly what some are starting to do.

Fourth, let’s understand who’s accountable here — it’s the IT administrator. From a business user perspective, if the IT guys can’t get me access to my data, or — worse — lose a bunch of it, that doesn’t look good.

And trying to defend yourself with a “well, my vendor said it’d be OK” will look like very weak sauce indeed.
