 

Cloud Backup and Recovery Guide

By Jeff Vance

Learn about the major obstacles, technologies, controversies and vendors in the cloud backup market.

Like It or Not, You Will Be Backing Up to the Cloud Soon

Many organizations are still resisting cloud backups for one main reason: trust. Businesses don't trust third parties with their sensitive data. They don't trust that data will be available when they need it, and they don't necessarily trust that they'll be able to move their data from one vendor's cloud to another.

All valid points. And, frankly, all of them are beside the point.

After Superstorm Sandy, the Asian tsunami, Hurricane Katrina and the Fukushima earthquake, the age of local-only backups is over. Only fools think otherwise. Even relying on regional-only backups is a dangerous proposition, as Superstorm Sandy so starkly proved. I suppose you could still back up to tape and FedEx your tapes several states away, but that's not a terribly elegant solution to the local backup problem, nor is it a cost-effective one.

Added to all this is the fact that today's enterprise is no longer a monolithic entity housed in a single building or office park. Workers telecommute, contractors have replaced many in-house workers, and it's not uncommon for employees to access — and change — corporate data from the road. Traditional backup methods fail to meet the challenges of how knowledge workers complete their work today.

The cloud's trust issues will eventually be worked out. In fact, the top cloud providers often do a heck of a lot more to protect data than the typical enterprise.

The age of backup and recovery in the cloud is already upon us. Are you ready? This guide will help prepare you for the inevitable.

Cloud Backup Project Center

Cloud Backup Obstacles

1. Trust and security concerns

For this guide, I polled approximately 50 cloud backup experts. The number one obstacle to cloud backup that they cited, by far, was trust and security. Most emphasized trust over the more technical idea of "security." That's telling.

Security is about encryption, access control and data protection. Trust is about more.

"The biggest obstacle is psychological, and relates to trust," said Nathaniel Borenstein, Chief Scientist for Mimecast, an email management company that uses the cloud to store and archive client emails. "Although off-site backups have been common for a long time, the idea of having your primary backup off-site is still scary to some. Most cloud backup systems are designed so that your backups are entirely in the hands of a third party, off site, and that feels like a loss of control to many customers."

Can you trust that the service provider will be in business a year from now? Five years? Can you trust that you'll be able to get your data out of their system and into a rival's? Can you trust them to get compliance right?

"Because of the shared, multi-tenant nature of cloud services, service level agreements (SLAs) and accountability for data loss and outages can be problematic," said Bob Bunge, associate professor at DeVry University's College of Engineering and Information Sciences.

SLAs are a major trust issue. When something goes wrong, can you trust that the service provider will make it right?

Eelco van Beek, CEO of Jitscale, an IT management company, brought up another salient question: "Will [cloud storage providers] hand over your data to a government agency if they are asked to do so?" Some providers will decrypt and turn over your data if a court orders them to. Worrying about government intrusion doesn't mean you're up to no good. Think about all of the bad PR telcos received for the warrantless wiretaps.

Finally, if these cloud providers have access to your data, are they able to do anything with it? If you go with some freemium company, can they resell your data, or even data patterns, to advertisers and marketers?

On the other hand, if you go with a company that doesn't keep the keys and cannot decrypt your data, you'd better be sure you never lose those keys. Otherwise, that data is lost for good.

Security is at the heart of the trust equation, but trust is about much, much more than just having the right security tools in place.

2. Compliance

Compliance is a big concern for heavily regulated industries, such as financial services and health care. However, compliance should be just as important for companies that store customer data, no matter what industry they are in. According to Symantec's latest cost of a data breach report, the average breach now costs companies $5.5 million, which includes forensic costs, fines, legal fees, etc.

Moreover, companies that experience a breach lose customers and experience increased churn in their customer base.

How will you prove to your auditors that your service provider is meeting regulatory requirements? Most providers should have some sort of compliance program in place, but you'll want to make sure to verify it. After all, it will be your company's reputation on the line if something goes wrong.

3. Resistance from IT

Rightly or wrongly, many IT pros worry that the cloud will help eliminate their jobs. If you're a storage administrator, this fear may not be that far from the truth. (Just because you're paranoid doesn't mean they aren't out to get you.) However, moving backups to the cloud could free IT administrators up from tedious maintenance and management chores, so they can turn their jobs into more strategic ones.

4. Cost

Backing up in the cloud is supposed to save you money, right? Well, it doesn’t always work out that way.

One of the main cost drivers comes from the fact that you can now save everything, so people do. Another problem is that many providers make it difficult to delete data in their clouds.

"As data is often bundled into larger blocks to improve cloud operations, how does the vendor handle the case when some of the data within a block should be deleted yet other data must be kept? It is surprising as to how few vendors do this well — some don’t delete any data in the cloud so the cloud storage bill continues to grow and grow," said Jerome Noll, director of Cloud Storage marketing at Riverbed Technology.
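To see why this is hard, here's a toy sketch (mine, not any vendor's actual format) of small objects packed into one immutable block. Deleting a single object doesn't reclaim any space; the provider has to rewrite all the surviving objects into a new block, which is exactly the compaction work Noll says few vendors do well.

```python
# Illustrative sketch of packed-block storage and compaction.
# All names and formats here are hypothetical, not any cloud vendor's API.

def pack(objects):
    """Bundle small objects into one block: an index of {name: (offset, length)}
    plus the concatenated bytes."""
    index, blob, offset = {}, b"", 0
    for name, data in objects.items():
        index[name] = (offset, len(data))
        blob += data
        offset += len(data)
    return index, blob

def compact(index, blob, deleted):
    """Reclaim space by rewriting the block without the deleted objects."""
    survivors = {
        name: blob[off:off + length]
        for name, (off, length) in index.items()
        if name not in deleted
    }
    return pack(survivors)

index, blob = pack({"a.txt": b"hello", "b.txt": b"world!", "c.txt": b"bye"})
new_index, new_blob = compact(index, blob, deleted={"b.txt"})
print(len(blob), len(new_blob))  # 14 8 -- space reclaimed only after the rewrite
```

A vendor that skips the `compact` step never shrinks the block, and the storage bill "continues to grow and grow."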

5. Unreliability of the Internet

The public Internet poses problems for those seeking to access cloud backup solutions. The status quo for remote connectivity — setting up VPN connections — isn't sufficient. Moreover, even with compression technologies, many just won't have enough bandwidth without either setting up expensive private networks (typically through MPLS lines) or finding an alternative, such as WAN optimization.
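Compression does help stretch a thin pipe, at least for the highly redundant data most offices generate. Here's a quick sketch using stock zlib; real WAN optimization products layer de-duplication and protocol tuning on top of this, but the basic bandwidth math is the same.

```python
# Rough illustration of why compression matters for cloud backup bandwidth:
# redundant office-style data shrinks dramatically even with plain zlib.
# The payload below is a stand-in, not real backup data.
import zlib

payload = b"quarterly report draft " * 1000   # highly repetitive, like many documents
compressed = zlib.compress(payload, level=6)

ratio = len(compressed) / len(payload)
print(f"{len(payload)} bytes -> {len(compressed)} bytes ({ratio:.1%})")

# Lossless: the original comes back bit-for-bit.
assert zlib.decompress(compressed) == payload
```

The catch, as noted above, is that already-compressed data (media files, encrypted blobs) won't shrink much, which is why compression alone often isn't enough.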

Networking limitations, both inside and outside of the data center, mean that cloud storage isn't really the best method for disaster recovery. It's great for data protection, but when it comes time to recover that data, you'll be in for a long, slow slog.

Of course, where networking is a significant enough pain point, you can bet vendors will try to solve it and cash in on it. In this case, technologies like Software Defined Networking (SDN) and WAN optimization could help.

Cloud Backup Technologies

Service provider models

The first thing you'll need to figure out is what type of cloud offering you prefer.

Companies that want to manage their own storage and create private clouds should choose an Infrastructure as a Service (IaaS) vendor. In this model, the provider delivers hardware, bandwidth (if it's a hosted private cloud) and the virtualization technology. End users must then provision and manage their own storage.

With the Platform as a Service (PaaS) model, the provider offers a storage platform, but does not delve into the specifics of data configuration settings and such.

Finally, if you choose a SaaS provider, or a public cloud storage provider, you will get a full backup and recovery application suite, and your data will reside in the cloud provider's data center. At least some of the responsibility for data protection, availability, security and compliance is now handed off to the service provider.

The choice in models really boils down to two key decisions. First, how much storage proficiency do you have in your IT staff? If your IT team is already understaffed and overworked, the arrows point towards SaaS solutions. Second, how critical is data ownership to you? If you must maintain complete control over your data and don't want to assume any of the risks involved with handing it off to third parties, it's probably best to go with an IaaS vendor, so you can build a private cloud for your backups.

PaaS is sort of the Goldilocks solution. Not too private, not too public, and it will come with hooks to a specific vendor (like VMware or Azure). It can be a smart choice if you plan to deploy a hybrid cloud.

Software defined storage

Software Defined Storage (SDS) is a concept that's gaining momentum with cloud storage providers. A snarky fellow on Wikipedia (and, no, I'm not being casually sexist; the editors on Wikipedia are nearly all men) argues that SDS is primarily a "marketing theme for promoting storage technologies." That's in the very first line of the entry. Clearly, SDS vendors are worried about issues other than editing Wikipedia entries.

There's a kernel of truth in there, but SDS is actually a valid concept.

The SDS concept follows closely on the heels of the Software Defined Networking (SDN) idea pioneered by startups Nicira (acquired by VMware for $1.26 billion), Vyatta (in the process of being acquired by Brocade) and Embrane.

Basically, SDS and SDN are familiar stories. It's pretty much the virtualization and cloud story applied to networking and storage. The goal is to separate the management of the bits and bytes (or packets) from the underlying hardware. With SDN, the concept (and OpenFlow is a huge part of this) could help eliminate expensive routers and switches that rely on proprietary operating systems in favor of commodity hardware turned into networking boxes with open-source OSes.

With SDS, some analysts would place any virtualized infrastructure in the SDS camp. That's probably an oversimplification. What sets SDS apart is that it turns storage into an extension of the hypervisor or operating system. Rather than having a dedicated VM or even an appliance with its own OS, you can treat all of your storage as one big virtual hard drive.

In other words, with SDS you don't need to worry about hardware and OS compatibility on the provider side. What does it matter? The storage has been abstracted from those issues. Of course, it's never that simple, but that's the vision — and really the roadmap — for where cloud storage is heading.

This doesn't mean that startups positioning themselves as SDS vendors, such as Nexenta and ScaleIO, will be the eventual winners in the cloud backup space. Rather, as is so often the case with startups, it simply means they're pointing the way. Winners are yet to be determined.

Architectural and backend technologies

If the cloud storage market is moving towards SDS, and I believe it is, then the backend technologies aren't really the end user's problem, are they? I strongly believe that eventually most hardware will be commodity hardware, and your main choice will be among service providers (AWS, Rackspace, AT&T, etc.). If you are building your own private cloud, many of your choices will be predetermined by the IaaS vendor you go with (VMware, Eucalyptus or an OpenStack- or CloudStack-based provider).

One other thing to consider is the issue of remote versus local backups. Due to networking challenges, cloud storage is best suited to data protection, retention and archiving — for now. Disaster recovery from the cloud will be a slow, tedious process when compared to recovery from local disk or even from tape.

However, in this age of extreme weather, it's a good idea to have a backup plan for your backup plan. So while recovery from the cloud is not ideal, it should be part of your disaster planning.
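The "long, slow slog" of cloud recovery is easy to quantify with back-of-envelope math. The sketch below assumes the WAN link is the only bottleneck, which is generous; protocol overhead, provider throttling and restore processing all make the real number worse.

```python
# Back-of-envelope estimate of full recovery time over a WAN link.
# The 80% utilization figure is an assumption, not a measured value.

def recovery_days(data_tb, link_mbps, utilization=0.8):
    """Days to pull data_tb terabytes over a link_mbps line."""
    bits = data_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400

# Recovering 10 TB over a 100 Mbps line:
print(f"{recovery_days(10, 100):.1f} days")        # ~11.6 days
```

Numbers like that are why local copies still matter for fast recovery, with the cloud copy as the last line of defense.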

Industry Controversies and Debates

What will the future of storage look like?

Will the SDS model actually win out? Or will hardware make a comeback?

After all, there are some inherent risks in a software-first storage world. If hardware becomes too commoditized, will quality suffer to the point that you just can't trust it? Sure, you can move data and VMs around at will, but will you need a carnival funhouse hall of mirrors of endless backups to ensure that crappy hardware isn't your undoing?

For my money, the high-profile acquisitions of SDN vendors point to a software-first cloud world. (I mean, that's really what cloud computing is about anyway, isn't it?) And plenty of investment money is being funneled to SDS startups too.

Of course, legacy equipment and architectures tend to linger for years and often even end up having real staying power in specialized niche areas, but the storage market looks to be moving towards SDS.

Security

Many experts cited security as the number one industry controversy. Fair enough, but I see this controversy as more of an obstacle, so please refer to the "Trust and security concerns" entry in the "Obstacles" section above.

Hybrid versus public cloud deployments

This controversy is really just an extension of the bandwidth and trust conversations from above. Trusting the cloud, and only the cloud, for backups and recovery just isn't smart for now. Most organizations will want local backups for quick availability and fast recovery, with cloud backups used for data protection, archiving and disaster recovery.

Moreover, some companies may adopt a hybrid approach in order to back up sensitive data onsite, with less sensitive data going to the cloud.

Inline de-duplication?

Vendors have disagreements over whether to de-duplicate data inline or after the fact (post-process). Inline de-duplication reduces redundant data on the fly, but since this is one more thing sitting in the network between point A (an application) and point B (storage), it can slow down the overall backup process.

Inline de-duplication also tends to require more powerful CPUs. The upside is that all data is stored in de-duplicated form from the start, which can be more efficient and allow for faster replication to the cloud.

On the other hand, post-process de-duplication avoids introducing yet another bottleneck into the network.
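The core mechanic both camps share is the same: split data into chunks, fingerprint each chunk, and store only the fingerprints you haven't seen. Here's a minimal sketch of the inline flavor, where the decision happens before anything is sent. (Production systems use variable-size, content-defined chunking; fixed-size chunks are just the simplest illustration.)

```python
# Toy sketch of inline de-duplication with fixed-size chunks and SHA-256
# fingerprints. The `store` dict stands in for cloud object storage.
import hashlib

CHUNK = 4096
store = {}                                # digest -> chunk bytes

def backup(data):
    """Return the recipe (ordered digests) needed to rebuild `data`."""
    recipe = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:           # inline: skip the "upload" for duplicates
            store[digest] = chunk
        recipe.append(digest)
    return recipe

def restore(recipe):
    return b"".join(store[d] for d in recipe)

data = b"A" * 8192 + b"B" * 4096          # three chunks, only two unique
recipe = backup(data)
print(len(recipe), len(store))            # 3 2 -- three referenced, two stored
assert restore(recipe) == data
```

Post-process de-duplication runs the same fingerprinting, but only after the raw data has already landed on disk, which is exactly why it stays out of the backup path.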

Should you use agents or not?

Whether or not a given backup solution should use agents really comes down to Work in Progress (WIP) files. WIP files are those that users work on pretty much every day. For a knowledge worker, this could be anything from Word files to spreadsheets to presentations. For a media company, JPEGs, PNGs, vector images and movie files could all be WIP files.

"The frequency with which WIP documents need to be backed up and retrieved are much higher than for other file types such as PSTs. It isn’t uncommon to see vendors employ an agent-based approach, wherein a lightweight software agent is deployed at the client end to continuously monitor for changes and automatically map these changes to the cloud-based copy. Also known as continuous data backup or real-time backup, this form of backup is recommended (and often employed) whilst backing up documents," said Balachandar Ganesh, research head at Credii, a software research and referral firm.

However, managing agents on a bunch of host servers can get pretty complex rather quickly. Providers of agent-less solutions, on the other hand, claim that they are easier to install and manage. In theory, at least, agent-less solutions also simplify the recovery process when you actually have to find and restore something you previously backed up.
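At its simplest, the change-detection side of an agent looks something like the sketch below: snapshot each file's modification time and size, then diff against the last snapshot to find what needs uploading. Real agents hook filesystem events (inotify on Linux, the USN journal on Windows) rather than polling, and the function names here are mine, not any vendor's.

```python
# Toy sketch of a backup agent's change detection via periodic scans.
import os
import tempfile
from pathlib import Path

def scan(root):
    """Snapshot every file under root as path -> (mtime_ns, size)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
    return state

def changed_since(old, new):
    """Files to (re)upload: anything new or modified since the last snapshot."""
    return [path for path, sig in new.items() if old.get(path) != sig]

# Demo: edit a WIP document between two scans.
root = Path(tempfile.mkdtemp())
(root / "draft.docx").write_bytes(b"v1")
before = scan(root)
(root / "draft.docx").write_bytes(b"v2 with edits")
after = scan(root)
changed = changed_since(before, after)
print(changed)                            # only the edited draft is queued
```

Multiply that loop across hundreds of host servers and the management complexity the agent-less camp complains about becomes easy to picture.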

What about rich media?

This isn't an industry debate so much as it is an internal one within enterprises. Well, that's not quite true. Most enterprises are overlooking rich media, which will eventually be problematic as more and more valuable content is contained within audio, video and other media files.

Today, most organizations don’t back up much rich media. IT managers have a hard time justifying the expense of backing up the many audio or video files their organizations create, since those files consume a great deal of storage space and, potentially, bandwidth.

As cloud storage pushes down the per-GB cost of storage, and as compliance officers start realizing the risks inherent in not applying the same policies to rich media data as other types of data, expect practices to change.

Cloud Backup Vendors

(If I've missed any vendors that should be listed, please make a note in the comment field below.)

SMB and personal use providers

SMB and personal cloud providers should be on the enterprise radar — not necessarily because they may eventually target the enterprise, but rather because it's a good bet that your data is already being stored with some of these providers. Your employees are aware of them — and using them — and you should be too.

Major cloud storage providers for personal and SMB use include:

I should also note that many of the home antivirus and firewall vendors (Symantec, TrendMicro, Kaspersky, etc.) are also including cloud-based backups in their security suites.

A new cloud storage player, Bitcasa, is getting attention since it offers 10GB of storage for free.

Enterprise cloud backup service providers:

Storage hardware and software providers that focus on cloud environments:

Software defined storage providers:

WAN optimization providers:

Jeff Vance is a Santa Monica-based writer. He's the founder of Startup50, a site devoted to emerging tech startups, and he also founded the content marketing firm, Sandstorm Media. Follow him on Twitter @JWVanc


This article was originally published on Monday Mar 11th 2013