What Do SLAs Say About Cloud Downtime Expectations?
Now that their organizations have (mostly) gotten things back on track following the most recent Amazon Web Services (AWS) outage, which began in the early morning hours of October 20th, many executives are adding up their share of the billions of dollars in financial impact estimated to result from the service disruption.
It is a good time to get a handle on the risks inherent in relying on the public cloud, as well as an opportunity to understand what kind of downtime you can expect with a public cloud provider and what recourse you may have in the event of a lengthy outage.
While losing access to your internal back office functions for a couple of days may be no big deal – a business is unlikely to suffer much damage if budget reports are a day or two late – significant repercussions are likely when critical IT workloads that support customer-facing functions go down. Lost revenue, frustrated customers, and reputational damage are common for any organization that loses contact with its customers.
Given what is at stake for your business, the AWS service level agreements seem a little light, and quite favorable to the cloud provider’s cause. If you pay for hosting in a single availability zone, AWS only commits to 99.5% uptime, which translates to up to 1 day, 19 hours, and 48 minutes of potential downtime per year.
Even if you pay the extra freight for multiple availability zones, you won’t get a full refund of your AWS charges unless uptime falls below 95%, which would have you potentially offline for 18 days and 6 hours or more per year, an amount of time that would likely drive many organizations out of business before they could collect the reimbursement. In the absence of separate business disruption insurance, a refund of your AWS monthly charges is all you can expect to recover from the outage.
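If you want to run the same arithmetic for other uptime commitments, the short Python sketch below converts an uptime percentage into the maximum downtime it allows. It assumes a 365-day year and measures uptime over that full year; the function names are ours, chosen for illustration, and your provider may measure uptime over a different window (a billing month, for example), which is why the period is a parameter.

```python
from datetime import timedelta


def max_downtime(uptime_percent: float,
                 period: timedelta = timedelta(days=365)) -> timedelta:
    """Return the downtime permitted within `period` at a given uptime percentage.

    Assumption: uptime is measured over the whole period (365 days by default).
    """
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period * (100 - uptime_percent) / 100


def describe(downtime: timedelta) -> str:
    """Format a timedelta as days, hours, and minutes (seconds are dropped)."""
    hours, remainder = divmod(downtime.seconds, 3600)
    minutes = remainder // 60
    return f"{downtime.days} days, {hours} hours, {minutes} minutes"


if __name__ == "__main__":
    for pct in (99.5, 95.0):
        print(f"{pct}% uptime allows up to {describe(max_downtime(pct))} per year")
```

At 99.5% uptime this works out to 1 day, 19 hours, and 48 minutes per year; at 95% it works out to 18 days and 6 hours. Passing a different period, such as `timedelta(days=30)`, gives a rough per-month figure.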
Public cloud certainly has its purpose, and can be a beneficial strategy in some circumstances, but if you’re heavily reliant upon it you need to evaluate your disaster recovery strategy, as well as your options for a resilient hybrid approach that combines cloud and data center colocation.
One place to start is reading Maintaining High Availability: 9 Critical Steps to Take for Disaster Recovery Success, a Direct LTx white paper that can be downloaded here. You can also arrange for a broader conversation about hybrid colocation, uptime, and resilient-yet-budget-friendly IT infrastructure strategies by emailing us at strategy@DirectLTx.com.