On Monday, Avere Systems announced a partnership with Google that empowers customers to transfer large data sets and workloads to the Google Cloud Platform. The collaboration means that companies can now transfer data from NAS storage systems to the Google Cloud Platform and benefit from the scalability and performance of the same infrastructure that powers Google search, Gmail and Google Drive. Avere FXT Edge Filer technology allows customers to run storage and compute workloads both on premises and in the cloud, thereby creating a hybrid cloud infrastructure optimized for cloud bursting scenarios and compute-intensive workloads. Avere Physical FXT Edge Filers deliver NAS for on-premises, file-based applications, whereas Virtual FXT Edge Filers provide a software solution that manages a high performance storage infrastructure within a cloud-based platform. The combination of Avere Physical and Virtual FXT Edge Filers allows customers to deploy solutions on premises and in the cloud while delivering high performance and low latency for big data applications. Because of its ability to support compute-intensive workloads and massive storage requirements, Avere Edge Filer technology has enjoyed notable success within the media and entertainment industry, as evinced by its usage by the visual effects studio Framestore. That track record with the massive computational and storage needs of digital media and entertainment strongly positions Avere Systems to serve organizations that want to build a compute and storage intensive hybrid cloud infrastructure in collaboration with Google Cloud Platform.
Avere Systems Partners With Google Cloud Platform To Deliver Hybrid Clouds For Compute And Storage Intensive Workloads
On Wednesday, Google announced the Beta release of Google Cloud Storage Nearline, a cloud-based storage product that transforms the economics of hot and cold storage. Whereas enterprises currently wrestle with the problem of managing frequently accessed data versus “cold” data, Google Cloud Storage Nearline renders cold data accessible within three seconds. The ability of Google Cloud Storage Nearline to access cold data means that organizations need not have separate infrastructures for managing cold and hot data but can instead leverage Google’s high performance, low cost storage solution to render historical data available within a few seconds. As a result, enterprises can serve up historical emails, audits and compliance findings, log files and data specific to decommissioned products and services with a virtually negligible time lag in comparison to hot data. Google’s product charges 1 cent per GB to store data within a framework that delivers enterprise-grade security, integration with Google Cloud Storage services and the ability to collaborate with vendors such as Veritas/Symantec, NetApp, Iron Mountain and Geminare for services such as backup, encryption, deduplication, data ingestion from physical hard drives and disaster recovery as a service. In the context of the larger cloud storage landscape, Google Cloud Storage Nearline poses a direct threat to Amazon’s Glacier, a solution that is similarly priced at 1 cent per GB with a focus on cold data. Unlike Google Cloud Storage Nearline, however, Amazon Glacier requires several hours for data retrieval rather than three seconds. Google Cloud Storage Nearline addresses a paradox of contemporary data management: whereas material objects such as garbage, newspapers and man-made products in general are subject to recycling and transformation, data has carved out a unique place for itself marked by freedom from outright destruction.
The immunity of data to being discarded is, of course, enabled by the ever decreasing price of hardware, but Google’s intervention to render historical data available within a few seconds stands to fundamentally disrupt and transform the economics of cloud storage.
On Tuesday, Mirantis announced the integration of OpenStack with Kubernetes, the open source framework developed by Google to manage containers. The integration between OpenStack and Kubernetes enhances the portability of applications between the private cloud infrastructures typical of OpenStack and public cloud environments such as the Google Cloud Platform and Microsoft Azure that support Kubernetes. Even though Docker containers are well known for enhancing the portability of applications across infrastructures, transporting applications and workloads from private clouds to public clouds remains challenging. The availability of Kubernetes within (OpenStack) private clouds in addition to public cloud environments now renders it easier to transport containerized applications from private to public clouds and subsequently obtain a greater return on investment from deploying hybrid cloud infrastructures.
Moreover, the integration between Kubernetes and OpenStack facilitates container management on the Mirantis OpenStack platform by automating and orchestrating the management of Docker containers within an OpenStack-based IaaS infrastructure. The integration between Kubernetes and OpenStack depends on the OpenStack Application Catalog Murano, which manages the infrastructure for Kubernetes clusters and deploys the Docker application to the Kubernetes cluster. As the application and Kubernetes cluster scale, Murano manages the interplay between OpenStack compute, storage and networking resources and the application to ensure support for the infrastructure needs of the application and its attendant Kubernetes cluster. Tuesday’s announcement underscores the burgeoning power of containers, container management frameworks such as Google’s Kubernetes, the significance of OpenStack within the private cloud space as well as the increasingly urgent need for technologies that promote communication across cloud infrastructures toward the end of realizing the true potentiality of hybrid cloud environments. The integration of Kubernetes and OpenStack’s Murano will be available for preview on the Mirantis OpenStack Express platform in April 2015.
On Wednesday, Google announced the availability of PerfKit Benchmarker, an open source application for benchmarking cloud performance across a variety of cloud infrastructures. PerfKit Benchmarker tackles the notorious difficulty of obtaining metrics about cloud platforms that enable an apples-to-apples comparison of cloud performance and operational efficacy. PerfKit reports on metrics such as “application throughput, latency, variance and overhead” in addition to data related to the time required to provision resources. Available by means of an Apache License v2, PerfKit Benchmarker is complemented by PerfKit Explorer, a visualization platform that features dashboards and other tools that facilitate rapid comprehension of trends and the business significance of the metrics collected by PerfKit Benchmarker. In a blog post, Google pledged to keep PerfKit current with the evolution of contemporary cloud infrastructures as follows:
PerfKit is a living benchmark framework, designed to evolve as cloud technology changes, always measuring the latest workloads so you can make informed decisions about what’s best for your infrastructure needs. As new design patterns, tools, and providers emerge, we’ll adapt PerfKit to keep it current. It already includes several well-known benchmarks, and covers common cloud workloads that can be executed across multiple cloud providers.
PerfKit currently supports the Google Cloud Platform in addition to Amazon Web Services and Microsoft Azure, according to TechCrunch. All told, the release of PerfKit Benchmarker constitutes a seminal moment for the cloud computing industry given the dearth of data that enable cross-vendor comparisons, metrics compilation and benchmarking. Despite the availability of platforms such as CloudHarmony, New Relic and Splunk, few tools in the industry facilitate vendor comparisons by leveraging transparent methodologies and metrics-development practices. The key question regarding PerfKit, however, will be the degree to which its measurement practices indirectly play to the strengths of the Google Cloud Platform (GCP), although presumably the Google Cloud Platform Performance team would know better than to create a benchmarking tool that serves to cast a positive light on GCP. Moreover, PerfKit was developed in collaboration with the likes of CenturyLink, CloudHarmony, Intel, Microsoft, Rackspace and Red Hat, which in and of itself suggests the cloud computing space stands poised to leverage Google’s record of innovation and quality in conjunction with “quarterly discussion on default benchmarks and settings proposed by the community” led by Stanford and MIT. Regardless, PerfKit represents an exciting moment for the technology landscape as cloud computing continues to lean in the direction of interoperability, open standards and APIs between proprietary platforms that facilitate workload sharing and an increasingly open ecosystem for application development and data sharing.
Last week, Google released the Beta version of the Google Cloud Monitoring platform. Derived from its May 2014 acquisition of Stackdriver, Google Cloud Monitoring enables users to obtain insight into the performance of Google App Engine, Google Compute Engine, Cloud Pub/Sub, and Cloud SQL. As noted in a blog post by Google’s Dan Belcher, Google Cloud Monitoring delivers integrated monitoring of infrastructure, systems, uptime, trend analysis and alerts by way of a SaaS application. In addition, Google Cloud Monitoring enables users to create aggregations of select resources for monitoring and leverage dashboards that elaborate on metrics such as latency, capacity, uptime and other performance-related metrics. The platform also enables users to configure alerts specifying the achievement of designated metrics as well as endpoint checks notifying users about the lack of availability of APIs, web servers and other “internet-facing resources.” The beta release of Google Cloud Monitoring comes after months of preparation that culminated in the ability of the Stackdriver-based cloud monitoring platform to support the needs of Amazon Web Services customers as well as Google Cloud Platform customers alike. The release also follows soon upon Google’s announcement of details of Google Cloud Trace, a Beta platform that allows users to analyze remote procedure calls (RPCs) created by a Google App Engine-based application to understand latency distributions between different RPCs and “performance bottlenecks” more generally. The larger significance of the Beta release of Google Cloud Monitoring is that it delivers a monitoring tool that can monitor both Google Cloud Platform and Amazon Web Services infrastructures, whereas Amazon’s CloudWatch, for example, is dedicated solely to monitoring the AWS platform. 
For now, though, the product underscores Google’s commitment to building its IaaS infrastructure as exemplified by two Beta releases within the early weeks of 2015.
On Wednesday, October 1, Google slashed prices for its Google Compute Engine platform by 10% for all instances. The price cut represents yet another iteration of the trend of price cuts in the IaaS space as evinced by recent price reductions from Amazon Web Services, Microsoft Azure and Google itself. In a blog post announcing the change, Urs Hölzle, Senior Vice President, Technical Infrastructure at Google, noted that decreases in price in the IaaS industry were such that “only 20% of time is spent how it should be — building new products or systems that will be platforms for growth,” thereby allowing for increased time for application development. The results of Google’s IaaS cuts are reflected below:
Google’s price cuts render it increasingly competitive against the likes of Amazon Web Services, Microsoft Azure and the increasingly vibrant community of commercial OpenStack vendors. Hölzle proceeded to note how Snapchat, Workiva and sponsors of the 2014 World Cup differentially leverage the Google Compute Engine Platform to simplify their infrastructure needs. Meanwhile, Google’s Sundar Pichai, SVP of Android, Chrome and Apps, reported at Atmosphere that Google Drive now claims 240 million users, or an increase of 50 million active users from June. The bottom line here is that Google is beginning to amplify its assault on enterprise cloud computing customers by cutting prices and rolling out educational campaigns to inform users of the benefits of its cloud platform. Google has the capital and cash position to cut prices further, so Amazon Web Services will need to pay close attention to ensure that Google does not catch it off guard with an aggressive forthcoming price cut or promotion that brings in a slew of customers and cascades into a sizeable dent in AWS IaaS market share.
Google recently announced development of Mesa, a data warehousing platform designed to collect data for its internet advertising business. Mesa delivers a distributed data warehouse that can manage petabytes of data while delivering high availability, scalability and fault tolerance. Mesa is designed to update millions of rows per second, process billions of queries and retrieve trillions of rows per day to support Google’s gargantuan data needs for its flagship search and advertising business. Google elaborated on the company’s business need for a new data warehousing platform by commenting on its evolving data management needs as follows:
Google runs an extensive advertising platform across multiple channels that serves billions of advertisements (or ads) every day to users all over the globe. Detailed information associated with each served ad, such as the targeting criteria, number of impressions and clicks, etc. are recorded and processed in real time…Advertisers gain fine-grained insights into their advertising campaign performance by interacting with a sophisticated front-end service that issues online and on-demand queries to the underlying data store…The scale and business critical nature of this data result in unique technical and operational challenges for processing, storing and querying.
Google’s advertising platform depends upon real-time data that records updates about advertising impressions and clicks in the larger context of analytics about current and potential advertising campaigns. As such, the data model requires the ability to accommodate atomic updates to advertising components that cascade throughout an entire data repository, consistency and correctness of data across datacenters and over time, the ability to support continuous updates, low latency query performance, scalability as illustrated by the ability to support petabytes of data and data transformation functionality that accommodates changes to data schemas. Mesa utilizes Google products as follows:
Mesa leverages common Google infrastructure and services, such as Colossus, BigTable and MapReduce. To achieve storage scalability and availability, data is horizontally partitioned and replicated. Updates may be applied at granularity of a single table or across many tables. To achieve consistent and repeatable updates, the underlying data is multi-versioned. To achieve update scalability, data updates are batched, assigned a new version number and periodically incorporated into Mesa. To achieve update consistency across multiple data centers, Mesa uses a distributed synchronization protocol based on Paxos.
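The batching and multi-versioning described in the excerpt above can be illustrated with a toy, single-process sketch. This is not Google's implementation: the class and method names are hypothetical, summing counters is an assumed aggregation, and partitioning, replication and the Paxos-based synchronization are left out entirely. It only shows how pending updates become visible as one atomic batch under a new version number, and how a query pinned to an older version stays repeatable.

```python
from collections import defaultdict


class VersionedStore:
    """Toy multi-versioned counter store (hypothetical names, not Mesa's API)."""

    def __init__(self):
        # Committed batches; the batch at index i carries version i + 1.
        self._deltas = []
        # Updates collected since the last commit; invisible to queries.
        self._pending = defaultdict(int)

    def update(self, key, amount):
        """Queue an update for the next batch."""
        self._pending[key] += amount

    def commit(self):
        """Apply all queued updates as one batch and return its version number."""
        self._deltas.append(dict(self._pending))
        self._pending.clear()
        return len(self._deltas)

    @property
    def latest_version(self):
        return len(self._deltas)

    def query(self, key, version=None):
        """Return the aggregated value of key as of the given version."""
        if version is None:
            version = self.latest_version
        if not 0 <= version <= self.latest_version:
            raise ValueError(f"unknown version: {version}")
        # Sum the deltas of every batch up to and including the version.
        return sum(delta.get(key, 0) for delta in self._deltas[:version])


store = VersionedStore()
store.update("ad-1", 3)
store.update("ad-1", 2)
first = store.commit()           # both updates land together as version 1
store.update("ad-1", 10)         # queued, not yet visible
before_commit = store.query("ad-1")
second = store.commit()          # now visible as version 2
print(first, second, before_commit, store.query("ad-1"), store.query("ad-1", version=first))
```

Because each batch is stored as a separate delta, a reader that pins version 1 keeps seeing the same answer even after later batches are committed, which is the property the excerpt calls consistent and repeatable updates.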
While Mesa takes advantage of technologies from Colossus, BigTable, MapReduce and Paxos, it delivers a degree of “atomicity” and consistency that its counterparts lack. In addition, Mesa features “a novel version management system that batches updates to achieve acceptable latencies and high throughput for updates.” All told, Mesa constitutes a disruptive innovation in the Big Data space that extends the attributes of atomicity, consistency, high throughput, low latency and scalability on the scale of trillions of rows toward the end of a “petascale data warehouse.” While speculation proliferates about the possibilities for Google to append Mesa to its Google Compute Engine offering or otherwise open-source it, the key point worth noting is that Mesa represents a qualitative shift with respect to the ability of a Big Data platform to process petabytes of data that experiences real-time flux. Whereas the cloud space is accustomed to seeing Amazon Web Services usher in one breathtaking innovation after another, Mesa underscores Google’s continuing leadership in the Big Data space. Expect to hear more details about Mesa at the Conference on Very Large Data Bases next month in Hangzhou, China.