Wait, the cloud goes down? As evidenced earlier this month, yes, yes it does. I noticed it when I went to stream a movie on Netflix and got an error message instead. Netflix wasn’t the only one impacted: when I realized I couldn’t stream a movie (I was trying to watch Hackers, by the way), SiteDown.Co listed plenty of other services as having issues as well.
There is a misconception out there that leads people to believe the cloud is magical: a magical entity you can use to run all kinds of applications and workloads, and it works great…until it doesn’t. After the outage, Data Center Frontier took a look at what is under the covers at AWS. It contains all of the things you’re used to working with in your own data center: compute, networking, and storage, but at a much larger scale than you may be used to. We’re talking 100,000 servers per data center. While the function of the components is what we would expect, Amazon tends to build and develop most of its technology in house, with many custom hardware and software components in use. While this works for hyperscale enterprises, it may not work for your own environment.
What matters is this: at the end of the day, your “cloud” is still located in a data center someplace, and you need to architect around that fact accordingly. Data centers have hardware, and hardware fails, whether you bought it or built it yourself. Data centers require software to operate that hardware, and software can have issues, whether you bought it or wrote it in house. Would you really run a business-critical application in one data center, without any protection against an outage or disaster? That’s what happens if you put your application in a single cloud and rely on it completely.
Within Amazon itself there are, of course, ways to protect an application against an outage such as this one, which affected a single region. One way to architect around it would be to spread your applications across regions within a cloud provider, much like you might have two data centers in two different locations. But what if something more sinister had happened, something that occurred in multiple AWS regions at the same time? In that case, you would still have the potential for failure, even after attempting to architect against it within a single cloud provider.
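As a toy illustration of that cross-region idea, here is a client-side failover sketch. This is an assumption-laden example, not anything AWS provides: the endpoint URLs are made up, and a real deployment would more likely rely on DNS or load-balancer health checks than on client code.

```python
import urllib.request
import urllib.error

# Hypothetical regional endpoints -- the hostnames are illustrative only.
ENDPOINTS = [
    "https://app.us-east-1.example.com/health",
    "https://app.us-west-2.example.com/health",
]

def first_healthy(endpoints, timeout=2):
    """Return the first endpoint that answers its health check, or None."""
    for url in endpoints:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return url
        except (urllib.error.URLError, OSError):
            continue  # this region is unreachable; try the next one
    return None
```

If the primary region goes dark, the loop simply falls through to the next region, which is the whole point of having one.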
Another option would be to use a multi-cloud strategy, and, well, why wouldn’t you? If I had a dollar for every time someone mentioned they used at least a dual-vendor strategy for everything, I’d be writing this article from a private island someplace. Hybrid cloud isn’t just about on premises and off premises; it’s about protecting your data, wherever it may be living at the time. To do this, you could have an architecture that takes advantage of technologies like global load balancing to re-route traffic away from a provider that is down. Something like OpenStack could be set up as an abstraction layer to manage the multiple environments, enabling you to use each cloud as a separate compute environment, independent of the others.
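One way to picture that abstraction-layer idea is each cloud sitting behind a common interface, with work routed to whichever provider is currently healthy. This is a minimal sketch of the pattern, not OpenStack’s actual API; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Each cloud is modeled as an independent compute environment behind
# the same small interface. The fields stand in for provider-specific
# health probes and "run this job" calls.
@dataclass
class CloudBackend:
    name: str
    healthy: Callable[[], bool]    # e.g. wraps an HTTP health check
    submit: Callable[[dict], str]  # provider-specific job submission

def route(job, backends):
    """Send the job to the first healthy cloud, failing over in order."""
    for backend in backends:
        if backend.healthy():
            return backend.submit(job)
    raise RuntimeError("all clouds are down")
```

Because each backend is independent, an outage at one provider just means `route` skips it, which is exactly the protection a single-cloud deployment can’t give you.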
Do any of these strategies cost more time, effort, and money? Of course, but what is your application worth to you? What are the business impacts of a failure? The answer may be different for each application in your environment, and you may need to plan your cloud strategy on an application-by-application basis. Perhaps if you’re running true test and dev workloads you don’t care about an outage…but if you’re paying your development team a good sum of money, you may. Simply moving a workload to the cloud isn’t going to solve all of your problems, even if it does make your life easier at the time.
The cloud isn’t magic. While moving your application to the cloud means you don’t have to deal with hardware anymore, it doesn’t mean someone else doesn’t have to. Everything you’re used to having break on you still runs the cloud, so make sure you’re planning accordingly.