Disaster Recovery in the Cloud

Today I want to take a good long look at one of my favorite topics, disaster recovery, and how it relates to one of my least favorite topics, the cloud (you can see why I hate the cloud here).  For today’s purposes, when I say cloud I generally mean the hyperscaler public clouds such as AWS, Azure, and Google Cloud.  We’re going to review some disaster recovery basics, then take a deeper dive into disaster recovery in the cloud.

What is disaster recovery?

Now before we get too far down this path, let’s recap what disaster recovery is.  Disaster recovery is simply the act of restoring services and systems after a disruption occurs.

I say simple, but it is often anything but.  Business continuity and disaster recovery planning is essential for every organization, and it can get complicated.

You’ll hear the terms Recovery Time Objective (RTO) and Recovery Point Objective (RPO) thrown around quite a bit.  They boil down to how fast I can recover my systems and data (RTO), and how old the data I am recovering from can be (RPO).  These numbers are often determined by business drivers, but sometimes people just make them up in the absence of a proper Business Impact Analysis (BIA).
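To make the two terms concrete, here is a minimal Python sketch that compares what happened in an outage against the objectives.  The names (RecoveryObjectives, evaluate_recovery) and the timestamps are made up for illustration; they are not from any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    rto: timedelta  # maximum tolerable downtime
    rpo: timedelta  # maximum tolerable data loss, expressed as time


def evaluate_recovery(objectives, last_good_copy, disaster_at, restored_at):
    """Compare one (hypothetical) outage against the stated objectives."""
    # Everything written after the last good copy is lost.
    data_loss_window = disaster_at - last_good_copy
    # Downtime runs from the disruption until service is back.
    downtime = restored_at - disaster_at
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= objectives.rpo,
        "rto_met": downtime <= objectives.rto,
    }


# Illustrative numbers only: 4 hour RTO, 1 hour RPO.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
result = evaluate_recovery(
    objectives,
    last_good_copy=datetime(2024, 1, 1, 8, 30),
    disaster_at=datetime(2024, 1, 1, 9, 0),
    restored_at=datetime(2024, 1, 1, 14, 0),
)
print(result["rpo_met"], result["rto_met"])  # True False
```

In this made-up outage only 30 minutes of data is lost, so the RPO holds, but the five hours of downtime blows through the four hour RTO.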

After planning is done, the disaster recovery process must be tested and optimized so that it will be successful.  Simple, right?  It is, but there are many other factors that make the task of disaster recovery far more difficult than it should be.

Disasters come in all shapes and forms, from natural disasters to human disasters (oops that rogue employee just made an error), to simply disruptive service outages from a power or network provider.  Oh, and we can’t forget my favorite disaster as of late, a ransomware attack.

If you’re looking to learn more about disaster recovery in general, then check out Google Cloud’s disaster recovery guides which provide an excellent overview.

What is disaster recovery in the cloud?

Things can get a bit nebulous when it comes to the cloud, but disaster recovery in the cloud means you are recovering from disaster to the cloud as your recovery site.  Whether you start on premises or you start within the cloud doesn’t really matter, although there are some different planning considerations for each.  The end state is the cloud in this scenario.

Why is disaster recovery important in the cloud?

Look, disaster recovery is important everywhere.  It doesn’t matter if your systems are in your data center or someone else’s data center (aka the cloud).  At the end of the day, we have business requirements that need to be met, no matter what happens.

If you’re already operating in the cloud, disaster recovery becomes especially important, since many operate under a false sense of security.  Remember, the cloud is a shared responsibility model, not a no responsibility model.

I hate to break it to you, but the cloud is nothing special.

Is disaster recovery the same as backup?

Disaster recovery is not the same as backup.  Backup simply means you have another copy of your data to use, although many times backup data is used as the basis for recovery on disaster day.

How is disaster recovery implemented in cloud computing?

Disaster recovery in cloud computing should be implemented just like it was on premises, which brings us to our first problem.  In many cases, disaster recovery in the data center has been ignored, because it has simply been less expensive to accept the risk of a disaster happening than to properly plan for a disaster.

Here’s a very quick overview of how to implement disaster recovery in cloud computing.

  1. Identify assets to be protected
  2. Determine proper RTO and RPO via a BIA
  3. Protect assets according to RPO
  4. Create recovery plan to meet RTO
  5. Test recovery plan
  6. Continue updating and testing recovery plan
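As a rough illustration of steps 1 through 3, the sketch below keeps a small asset inventory with the RPO and RTO a BIA might produce, then flags anything whose backup schedule cannot meet its RPO.  The asset names, the numbers, and the assumption that worst-case data loss is about one backup interval are all mine; treat it as a starting point, not a real tool.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Asset:
    name: str
    rpo: timedelta              # from the BIA (hypothetical values below)
    rto: timedelta              # from the BIA (hypothetical values below)
    backup_interval: timedelta  # how often a recoverable copy is taken


def assets_missing_rpo(assets):
    """Return names of assets whose backup interval is longer than their RPO.

    Assumption: with periodic backups, the worst-case data loss is roughly
    one full interval between copies.
    """
    return [a.name for a in assets if a.backup_interval > a.rpo]


inventory = [
    Asset("orders-db", rpo=timedelta(minutes=15), rto=timedelta(hours=1),
          backup_interval=timedelta(hours=24)),
    Asset("intranet-wiki", rpo=timedelta(hours=24), rto=timedelta(hours=48),
          backup_interval=timedelta(hours=24)),
]

print(assets_missing_rpo(inventory))  # ['orders-db']
```

A nightly backup is fine for the wiki, but the database with a 15 minute RPO needs a different protection method.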

The problem with disaster recovery is that traditionally it can be very time consuming, which of course ends up costing us money.  At the end of the day, a cloud disaster recovery plan is not much different from a disaster recovery plan you’ve used to recover to your secondary data center in the past.

OK how about this multicloud stuff?  Will that help with DR?

Don’t even get me started on this one.  The only thing I hate more than one cloud is multi-cloud.  While it is thrown around as a concept, staggeringly few have really taken a good look at it, and I have my opinions. 

While on paper multi-cloud is fantastic, the fact of the matter is it can be cost prohibitive.  Has anyone ever looked at their bill from moving something out of their chosen public cloud?  This comes back to the results of the BIA for your applications.

However, that doesn’t mean you get a free pass on disaster recovery in the cloud, whether it be within a cloud or from your data center to the cloud.  Remember, the cloud is someone else’s data center.  You wouldn’t run your whole environment in a single rack in a single data center, would you?  Even if you remain in the public cloud, you need to have a DR plan across failure domains.  Better yet, distribute your application across failure domains to start, and keep geographical locations in mind.
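If you want a quick sanity check on failure domains, something like the following Python sketch can help.  It assumes you can export an inventory of (application, region, zone) tuples from your cloud of choice; the inventory below is invented, and the function name is hypothetical.

```python
from collections import defaultdict


def single_domain_apps(instances):
    """Return applications whose instances all sit in one failure domain.

    `instances` is an iterable of (app, region, zone) tuples.  Here a
    failure domain is taken to be a (region, zone) pair, which is an
    assumption; your provider may offer finer or coarser domains.
    """
    domains = defaultdict(set)
    for app, region, zone in instances:
        domains[app].add((region, zone))
    return sorted(app for app, seen in domains.items() if len(seen) < 2)


# Made-up inventory for illustration.
inventory = [
    ("web", "us-east-1", "us-east-1a"),
    ("web", "us-east-1", "us-east-1b"),
    ("billing", "us-east-1", "us-east-1a"),
    ("billing", "us-east-1", "us-east-1a"),
]

print(single_domain_apps(inventory))  # ['billing']
```

Two instances in the same zone are still a single rack in a single data center as far as a zone outage is concerned.  Note that this check would still pass an application spread across zones in one region, so it says nothing about a regional outage.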

Types of Cloud Disaster Recovery

When we think about disaster recovery in the context of cloud, there are two huge scenarios that come to mind.  Let’s take a look at both.

Disaster recovery from your data center to the cloud

In this case, your virtual machines live in your data center, but are recovered to the cloud.  Of course, there are many different types of clouds.  There are the hyperscaler public clouds, VMware Clouds, and smaller cloud service providers as well.

Be sure to check out my deep dive on disaster recovery for your VMware environment!

Whichever the case, there are considerations you need to make for recovering to this remote location.  First of all, you need to determine how to protect your workloads to meet your RPO and RTO, and make sure the data you are using for recovery is in your cloud of choice. 

There are many different ways to do this of course, but one popular method is to send a copy of your backups to cloud storage, so it is ready and waiting for you on disaster day.

Disaster recovery within the cloud

A cloud-based disaster recovery plan may look a bit different than a traditional plan.  After all, your infrastructure is already there and no one is rushing to a second data center.  Of course, cloud workloads still must be protected just as your on-premises workloads were.

I can’t stress this enough, but backup capabilities matter in the cloud.  The number of people who deploy their whole infrastructure to the Amazon Web Services US-East-1 region astounds me every time there is an outage.  Yes, cloud backup is a thing, and cloud providers aren’t backing up your data for you.

This happens because the design phase of a failover environment is often skipped when it comes to the cloud, thanks to the common misconception that the cloud just does everything for you.

Tips for Disaster Recovery Success in the Cloud

Now that we’ve set a foundation for disaster recovery in the cloud, let’s talk about how to avoid some of these problems with my favorite tips.

Start with the Business Impact Analysis

You need to know your RPO and RTO before you create your disaster recovery plan.  Full stop.  Unfortunately, this is often easier said than done.

Pick the Right Data Protection

You need to protect your virtual servers the right way to meet those RPOs and RTOs, and there are many different combinations of technologies and methods to do this.  Keep an eye on those recovery times; faster recovery times are becoming more and more common.

Pick the Right Cloud

Pick the right cloud for the job!  Everyone has their own business requirements to meet at the end of the day, and there is no single “best” cloud or “right” cloud.  This is going to be dependent on your environment.  Often cost and the operational aspect of using the cloud are two of the most critical deciding factors.

Train Your People!

Someone has to run this thing, even if it is the cloud.  Invest in your people, from a design, architecture, and operational standpoint.  Make sure that your organization has the proper training and understanding before you start using the cloud.

Design Before You Deploy

Before you spin up a virtual server (virtual machine, instance, VM instance, whatever) in the cloud, take the time to design your deployment.  This is a crucial step many organizations skip because it is so “easy” to get started with the cloud.

Create and Update Your Documentation

Don’t forget your disaster recovery plan.  This document is key to your successful recovery, and it also is a must have when the auditors come knocking.

Test Your Recovery

It is so important to test your recovery before disaster day.  A real world disaster recovery test is the only way to accurately determine what your RTO is.  This is one area where you can gain a massive amount of efficiency with orchestration.
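Here is a minimal Python sketch of what timing a recovery test might look like.  The steps are stand-ins that just sleep; in a real test each one would call whatever restores or starts the actual service.  The function name and the five second RTO are purely illustrative.

```python
import time
from datetime import timedelta


def run_recovery_test(steps, rto):
    """Run each recovery step in order and time it.

    `steps` is a list of (name, callable) pairs.  Returns per-step timings
    in seconds, the total elapsed time, and whether the total fits the RTO.
    """
    timings = []
    start = time.perf_counter()
    for name, action in steps:
        step_start = time.perf_counter()
        action()
        timings.append((name, time.perf_counter() - step_start))
    total = timedelta(seconds=time.perf_counter() - start)
    return {"timings": timings, "total": total, "rto_met": total <= rto}


# Placeholder steps; replace the sleeps with real restore actions.
steps = [
    ("restore database", lambda: time.sleep(0.02)),
    ("start application", lambda: time.sleep(0.01)),
]

report = run_recovery_test(steps, rto=timedelta(seconds=5))
for name, seconds in report["timings"]:
    print(f"{name}: {seconds:.2f}s")
print("RTO met:", report["rto_met"])
```

The per-step timings are the useful part, since they show where the recovery time goes and which steps are worth automating first.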

Cloud DR Doesn’t Have to Be Nebulous 

Look, cloud-based DR doesn’t have to be difficult, but it still takes time and effort in up-front planning for a disaster recovery exercise later.  In fact, if you are still new to the cloud it may be a bit more difficult as your organization gets used to cloud operations.