Why Amazon’s Cloud Computing Outage Didn’t Violate Its SLA

Amazon’s cloud computing outage on April 21 and April 22 can be interpreted in one of two ways: (1) as a reflection on Amazon’s EC2 platform and its disaster recovery processes; or (2) as a commentary on the state of the cloud computing industry as a whole. The outage began on Thursday, April 21, and involved problems specific to Amazon’s Northern Virginia data center. Companies affected by the outage include HootSuite, FourSquare, Reddit, Quora and start-ups such as BigDoor, Mass Relevance and Spanning Cloud Apps. HootSuite, a dashboard that allows users to manage content on a number of websites such as Facebook, LinkedIn, Twitter and WordPress, experienced a temporary crash on Thursday that affected a large number of sites. The social news website Reddit was unavailable until noon on Thursday. BigDoor, a 20-person start-up that provides online game and rewards applications, had restored most of its services by Friday evening even though its corporate website remained down. Netflix and Recovery.gov, meanwhile, escaped the Amazon outage either unscathed or with minimal interruption.

Amazon’s EC2 platform currently has five regions: US East (Northern Virginia), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), and Asia Pacific (Tokyo). Each region is composed of multiple “Availability Zones”. Customers who launch server instances in different Availability Zones can, according to Amazon Web Services’s website, “protect [their] applications from failure of a single location.” The Amazon outage underscores that EC2 customers can no longer depend on having multiple Availability Zones within a single region as insurance against system downtime. Customers will need to ensure that their architecture provides for duplicate copies of server instances in multiple regions.
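The redundancy argument above can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the function name and the 99% figure are hypothetical, and it assumes that the redundant deployments fail independently of one another, which is precisely the assumption that did not hold for Availability Zones within US East during this outage.

```python
def combined_availability(single: float, copies: int) -> float:
    """Chance that at least one of `copies` deployments is up.

    `single` is the availability of one deployment as a fraction (0.99 = 99%).
    Assumes failures are statistically independent (an idealization).
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # All copies must be down at once for the application to be down.
    return 1.0 - (1.0 - single) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} deployment(s): {combined_availability(0.99, n):.6f}")
```

When failures are correlated, as they were across zones in a single region here, the real figure is lower than this formula suggests, which is the case for spreading copies across regions instead.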

Amazon’s EC2 SLA commits to 99.95% annual uptime for customers who have deployments in more than one Availability Zone within a specific region. However, the SLA guarantees only the ability to connect to and provision instances. On Thursday and Friday, Amazon’s US East customers could still connect to and provision instances, but the outage adversely affected their deployments because of problems with Amazon’s Elastic Block Store (EBS) and Relational Database Service (RDS) platforms. EBS provides persistent block storage volumes that attach to EC2 instances, and RDS is a managed relational database service. Because Amazon’s problems were confined to EBS and RDS in the US East region, Amazon’s SLA for customers affected by the outage was not violated. The immediate consequence is that Amazon EC2 customers who want to approach 100% system uptime will need to deploy copies of the same server instance in multiple regions, on the assumption that the wildly unlikely scenario of multiple Amazon regions experiencing outages at the same time never transpires.
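It helps to translate an uptime percentage into the amount of downtime it actually permits. The following is a minimal sketch, not Amazon’s own calculation: the function name is hypothetical, the percentage is passed in as a parameter, and a 365-day year is assumed. It says nothing about which kinds of failure count toward the SLA, which is the crux of the point above.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted by an uptime commitment over a period."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * period_hours


if __name__ == "__main__":
    for pct in (99.5, 99.95, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% uptime permits about {hours:.2f} hours of downtime per year")
```

Under these assumptions, a 99.5% commitment permits roughly 43.8 hours of downtime a year and a 99.95% commitment roughly 4.4 hours, so a small difference in the percentage is a large difference in practice.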

Anyone familiar with the cloud computing industry knows full well that Amazon, Rackspace, Microsoft and Google have all experienced glitches resulting in system downtime in the last three years. The multiple instances of system downtime across vendors point to the immaturity of the technological architecture and processes for delivering cloud computing services. Until the architecture and processes for cloud computing operational management improve, customers will need to weigh the cost of redundant data architectures that insure them against system downtime against the risk and cost of actual downtime.
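The trade-off between paying for redundancy and absorbing downtime can be framed as a simple comparison. The sketch below is one possible way to set it up, with entirely hypothetical names and dollar figures; a real assessment would also account for reputational damage, data loss and engineering effort.

```python
def avoided_downtime_cost(outage_hours_per_year: float,
                          cost_per_hour: float,
                          residual_outage_hours: float = 0.0) -> float:
    """Yearly downtime cost that redundancy is expected to eliminate."""
    avoided_hours = max(outage_hours_per_year - residual_outage_hours, 0.0)
    return avoided_hours * cost_per_hour


def redundancy_pays_off(redundancy_cost_per_year: float,
                        outage_hours_per_year: float,
                        cost_per_hour: float,
                        residual_outage_hours: float = 0.0) -> bool:
    """True when the avoided downtime cost exceeds the cost of redundancy."""
    avoided = avoided_downtime_cost(
        outage_hours_per_year, cost_per_hour, residual_outage_hours
    )
    return avoided > redundancy_cost_per_year


if __name__ == "__main__":
    # Hypothetical figures: $50,000/year for a second region,
    # $2,000 lost per hour of downtime.
    print(redundancy_pays_off(50_000, outage_hours_per_year=40, cost_per_hour=2_000))
    print(redundancy_pays_off(50_000, outage_hours_per_year=10, cost_per_hour=2_000))
```

With these made-up numbers, redundancy is worth it at 40 expected outage hours a year but not at 10, which illustrates why the answer differs from one customer to the next.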

For a non-technical summary of the technical issues specific to the outage, see Cloud Computing Today’s “Understanding Amazon Web Services’s 2011 Outage”.


Google plans overhaul of YouTube to compete with Netflix

Google is planning a major overhaul of YouTube that will enable it to stream full-length television and video content. The Mountain View search engine giant has reportedly allocated $100 million for the initiative to acquire content, finalize licensing agreements and address the requisite technical challenges. Google plans to create channels such as “Sports” and “Drama” within YouTube containing original, professionally produced content that is proprietary to Google. According to The Wall Street Journal, Google and YouTube executives have held meetings with Hollywood talent agencies such as Creative Artists Agency, William Morris Endeavor and International Creative Management to discuss the creation of original content. Google’s play to enter the video space is motivated by an effort to increase the amount of time users spend on the site and thereby increase advertising revenue. Moreover, the company may decide to obtain additional revenue by offering select content to users on a paid or subscription basis. Google’s apparent decision to compete directly with Netflix signals intensified competition in the online streaming content space. That competition is spearheaded by cloud computing vendors such as Amazon and Google, which have the IT infrastructure to handle the bandwidth demands of delivering significant volumes of content to users on a daily basis. Amazon Prime, for example, offers its members access to 5,000 streaming movies for an annual membership fee of $79.

Wedge Partners analyst Martin Pyykkonen notes that Google’s plans to revamp YouTube constitute a significant threat to Netflix because of the sheer omnipresence of YouTube across virtually all online platforms. That said, Netflix has thus far proven to be the market leader in video content acquisition, as evinced by its recent finalization of a licensing agreement with Lions Gate Entertainment Corporation to stream seven seasons of the TV series “Mad Men.” So far, YouTube has been less than successful in acquiring licensing rights to longer video content. Nevertheless, the stock price of Netflix has dropped significantly over the last week. Netflix’s shares rose today by 2.52% to close at $233.92, although that remains below the $239.97 close on April 6.