Google recently announced details of a chip built specifically for machine learning, the Tensor Processing Unit (TPU), which has operated in stealth mode within Google’s data centers for over a year. The TPU delivers superior performance for machine learning use cases by fitting “more operations per second into the silicon” and is designed to work with TensorFlow, Google’s open source machine learning library. The TPU requires fewer transistors per operation and improves performance per watt by an order of magnitude for machine learning applications, in what amounts to “fast-forwarding technology about seven years into the future (three generations of Moore’s Law),” according to a Google blog post. The TPU currently powers applications such as Google Street View, RankBrain and Google Maps, and illustrates not only Google’s commitment to machine learning technologies but also its competitive differentiation as a vendor able to design and operationalize hardware optimized for machine learning applications.
Google recently announced the integration of BigQuery, its fully managed data warehouse that allows customers to store and query petabytes of data using SQL under a utility-based pricing model, with its consumer-facing Google Drive application. As a result of the integration, users can query files stored in Google Drive directly from the BigQuery user interface, without first loading them into BigQuery. Moreover, users can save query results directly into Google Sheets, and re-run those queries as the data within Google Sheets changes. The integration between Google BigQuery and Google Drive breaks down the barrier between the Google Cloud Platform and the Google Apps suite and correspondingly gives Google Cloud Platform customers a more seamless, integrated experience when querying data that resides outside of BigQuery. More importantly, the integration gives Google Drive users an extra incentive to tap into the lightning-fast SQL queries of BigQuery and explore its capabilities as a prelude to a more sustained investigation of its ability to analyze massive datasets and its impressive integrations with third parties such as Tableau, Talend and Qlik.
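To make the mechanics concrete, the federated setup can be sketched as the table definition a developer would submit to BigQuery’s REST API: a table whose data lives in a Drive-hosted Sheet rather than in BigQuery storage, so queries read the sheet at execution time. This is a minimal sketch; the sheet URL, dataset and table names below are hypothetical placeholders, not values from Google’s announcement.

```python
import json

# Hypothetical placeholder for a Drive-hosted spreadsheet URL.
SHEET_URL = "https://docs.google.com/spreadsheets/d/EXAMPLE_SHEET_ID"

def drive_backed_table(dataset_id: str, table_id: str, sheet_url: str) -> dict:
    """Build a BigQuery external-table definition pointing at a Drive Sheet.

    Because the table is external, nothing is loaded into BigQuery storage:
    BigQuery reads the file from Drive at query time, so results track
    edits made to the sheet.
    """
    return {
        "tableReference": {"datasetId": dataset_id, "tableId": table_id},
        "externalDataConfiguration": {
            "sourceUris": [sheet_url],
            "sourceFormat": "GOOGLE_SHEETS",
            "autodetect": True,  # infer the schema from the sheet contents
        },
    }

payload = drive_backed_table("analytics", "sales_sheet", SHEET_URL)
print(json.dumps(payload, indent=2))
```

Once such a table is defined, it can be queried with ordinary SQL alongside native BigQuery tables, which is what makes the Drive integration feel seamless to the user.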
The Google Cloud Platform experienced a major outage on April 11, marked by the loss of “external connectivity in all regions” and lasting roughly 18 minutes. Caused by a networking failure, the outage affected every region and represented one of the most systemic outages in the history of the public cloud, particularly given that AWS and Microsoft Azure have suffered outages specific to one or more availability zones, but never across all regions. Since the outage, Google claims to have resolved the issues with its network configuration software and, taking a cue from its competitor Amazon Web Services, served up a remarkably detailed post-mortem analysis covering the outage’s origin, chronology, escalation pathways, resolution and near- and long-term remediation. The larger point here, however, is that despite its recent windfall of customer signings featuring the likes of Apple, Spotify and Home Depot, the Google Cloud Platform is still ironing out the kinks in its IT infrastructure as they relate to process, technology, alerts, notifications and root cause analysis. The outage reflects both the evolution of the Google Cloud Platform and the continued immaturity of data-driven alerts and notifications, despite the efflorescence of contemporary technologies dedicated to intelligent automation, self-optimizing infrastructures and real-time analytics on streaming data. The lesson of Google’s recent outage is that the mitigating, data-driven checks and process automation designed to identify and swiftly ameliorate issues within the Google Cloud Platform as they arise have yet to mature to the point where they can isolate problems to a specific availability zone or cluster of availability zones, rather than allowing an unprecedented cascade across all regions.
As such, the outage raises more questions than it answers about the architecture undergirding the Google Cloud Platform, and about how software configuration glitches can have unexpectedly far-reaching consequences despite the surfeit of contemporary analytic capabilities available to proactively monitor the health of IT infrastructures.
Google recently announced the alpha release of Cloud Machine Learning, a managed, cloud-based framework for building machine learning models using the TensorFlow framework that undergirds products such as Google Photos and Google Speech. Google’s Cloud Machine Learning platform features a Cloud Vision API that can categorize images into over a thousand categories such as “tree,” “book” and “car,” identify “individual objects and faces within images,” and read print within images. The platform also features a Cloud Speech API that transcribes speech into text using neural network models. Moreover, Google Cloud Machine Learning contains a Google Cloud Translate API that can translate a source language into a supported target language, for example French into Japanese. Google Cloud Machine Learning integrates with Google Cloud Dataflow in addition to data stores such as Google Cloud Storage and Google BigQuery. By offering pre-trained machine learning models in conjunction with the capability to build customized models for specific scenarios and use cases, the platform delivers predictive modeling capabilities that can scale to support terabytes of data and rapidly proliferating data sources. Google Cloud Machine Learning competes with Amazon Machine Learning and Hewlett Packard Enterprise Haven OnDemand, the latter of which is now commercially available on the Microsoft Azure platform. The alpha release of Google Cloud Machine Learning further illustrates Google’s investment in its Google Cloud Platform and the depth of its commitment to building an increasingly competitive position in the contemporary cloud computing market.
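As an illustration of how the pre-trained models are consumed, the sketch below assembles the JSON body of a Cloud Vision API `images:annotate` request combining label detection (the “tree,” “book,” “car” categories mentioned above) with text detection for reading print within images. The image URI is a hypothetical placeholder; a real call would POST this body to the Vision API endpoint with valid credentials.

```python
import json

def annotate_request(image_uri: str, max_labels: int = 10) -> dict:
    """Build an images:annotate request body for the Cloud Vision API.

    The request asks for two feature types against one image: labels
    (categorization) and text (reading print within the image).
    """
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_labels},
                    {"type": "TEXT_DETECTION"},  # OCR on print in the image
                ],
            }
        ]
    }

# Hypothetical Cloud Storage URI for the image to be analyzed.
body = annotate_request("gs://example-bucket/street.jpg")
print(json.dumps(body, indent=2))
```

The batched `requests` array is what lets a single call annotate multiple images with different feature combinations, which suits the high-volume pipelines the platform targets.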
The Google Cloud Platform has secured Home Depot as one of its clients in what amounts to a huge coup for Google’s cloud business as it continues to court the enterprise. Google also revealed Coca-Cola as one of its customers at its Google Cloud Platform Next conference in San Francisco, alongside The Walt Disney Company. Former VMware CEO Diane Greene, who heads Google’s cloud business, noted in a press conference that the company is “dead serious about this business.” Greene further noted that Google has “spent billions on data centers and are going to use them as much as we can. This is a long-term, forever event.” The unveiling of Home Depot, Coca-Cola and The Walt Disney Company illustrates the success of the Google Cloud Platform in building relationships with enterprise customers under Greene and testifies to Google’s seriousness about continuing to invest in the Google Cloud Platform infrastructure and its related applications. Wednesday’s disclosure of high profile customer signings indicates that the three-horse race between Amazon Web Services, Microsoft Azure and the Google Cloud Platform will only intensify in subsequent months, particularly as enterprises become increasingly comfortable migrating select workloads to public cloud environments and correspondingly develop and implement long-term cloud deployment strategies featuring partnerships with one or more of the industry’s leading cloud platforms.
Google recently revealed details of Maglev, its network load balancing technology, which gives developers the ability to build infrastructures that can handle a million requests per second with no pre-warming. Maglev leverages Equal-Cost Multi-Path (ECMP) routing to disperse incoming network packets across all Maglevs, in conjunction with “consistent hashing techniques” that enable the accurate transmission of packets to the “correct service backend servers,” regardless of which Maglev receives a specific packet. Maglev’s use of ECMP differs from the common industry practice of an active-passive load balancer configuration, wherein the secondary, passive load balancer sits idle and awaits the opportunity to assume responsibility for load balancing as required. Whereas active-passive configurations can waste half of their load balancing resources, Maglev’s use of ECMP enables much fuller utilization of existing resources, as noted in Google’s blog post, below:
All Maglevs in a cluster are active, performing useful work. Should one Maglev become unavailable, the other Maglevs can carry the extra traffic. This N+1 redundancy is more cost effective than the active-passive configuration of traditional hardware load balancers, because fewer resources are intentionally sitting idle at all times.
Borg, Google’s cluster management technology, makes it possible to migrate service workloads between different clusters as required. Similarly, Maglev facilitates the addition and removal of load balancing capacity, and thereby illustrates the capability of Network Function Virtualization (NFV) technology to add and remove load balancing functionality without the addition of new hardware. Google’s deep dive into the workings of Maglev, which runs on commodity Linux servers, illustrates how the technology manages load balancing at scale, both for Google’s network traffic generally and for load balancing within the Google Cloud Platform. Named after the Japanese magnetic-levitation bullet train, Maglev simply requires the addition of new Maglev machines once existing ones approach a utilization threshold for network load balancing purposes.
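The consistent-hashing idea at the heart of this design can be sketched in a few dozen lines: each backend owns a pseudo-random permutation of a lookup table’s slots, backends take turns claiming slots, and any Maglev hashing a connection’s 5-tuple against the table lands on the same backend. This is a minimal pure-Python sketch loosely following the lookup-table construction Google has described; the backend names, hash choice (MD5) and table size are illustrative assumptions, not Google’s implementation.

```python
import hashlib

M = 65537  # lookup-table size; a prime, so any skip value cycles all slots

def _h(value: str, salt: str) -> int:
    # illustrative hash; the real system uses its own hash functions
    return int(hashlib.md5((salt + value).encode()).hexdigest(), 16)

def build_table(backends, m=M):
    """Populate a Maglev-style consistent-hash lookup table.

    Each backend walks its own permutation of slots (offset + j*skip mod m);
    backends take turns claiming their next unclaimed slot, so the table
    divides nearly evenly and most flow-to-backend mappings survive the
    loss of a single backend.
    """
    offset = {b: _h(b, "offset") % m for b in backends}
    skip = {b: _h(b, "skip") % (m - 1) + 1 for b in backends}
    nxt = {b: 0 for b in backends}
    table = [None] * m
    filled = 0
    while filled < m:
        for b in backends:
            while True:  # advance along b's permutation to an empty slot
                slot = (offset[b] + nxt[b] * skip[b]) % m
                nxt[b] += 1
                if table[slot] is None:
                    table[slot] = b
                    filled += 1
                    break
            if filled == m:
                break
    return table

def pick_backend(table, flow: str):
    # any Maglev hashing the same connection 5-tuple picks the same backend
    return table[_h(flow, "flow") % len(table)]

backends = ["backend-a", "backend-b", "backend-c", "backend-d"]
t_before = build_table(backends)
t_after = build_table(backends[:-1])  # backend-d becomes unavailable
# flows that were NOT on the failed backend mostly keep their assignment
moved = sum(1 for x, y in zip(t_before, t_after)
            if x != "backend-d" and x != y)
print(f"collateral remappings: {moved}/{M}")
```

Because every Maglev computes the same table independently, all of them stay active and useful, which is exactly the N+1 economics the blog post contrasts with active-passive pairs.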
Google CEO Sundar Pichai recently announced that Gmail has surpassed 1 billion monthly users and that the Google Cloud Platform is used by more than 4 million applications. Pichai also asserted that the Google Cloud Platform “is ready to be used at scale,” and that the company’s cloud infrastructure and applications have reached maturity at exactly the time when the broader, industry-wide “movement to cloud has reached a tipping point.” Pichai further noted that Catholic Health Initiatives, one of the nation’s largest non-profit health systems, announced its transition to Google Apps last quarter, yet another example of the Google Cloud Platform’s readiness to embrace workloads from large organizations and enterprises. Unlike Microsoft and Amazon, Alphabet, Google’s parent company, declined to break out revenue run-rate details for its cloud business, but the company’s appointment of former VMware CEO Diane Greene to head Google’s cloud services division in November constitutes ample proof of its interest in building out that business. The question now, however, is when and how Google plans to court the enterprise, a space traditionally dominated by Microsoft and IBM in software and infrastructure. Without more details of its anticipated strategy for gaining enterprise traction for its cloud products and services, investors and analysts alike will be hard pressed to understand how Google plans to build cloud market share, particularly given continued impressive revenue growth for Amazon Web Services and Microsoft’s growing ascendancy in the cloud products and services space under CEO Satya Nadella.