EMC’s Pivotal One Attempts To Bring IT Infrastructures Of Facebook, Google and Amazon Web Services To Enterprise

This week, EMC and its subsidiary VMware revealed details of the vision behind Pivotal, their spin-off company financed in part by $105 million in capital from GE. In a webcast announcing the launch of Pivotal on Wednesday, Pivotal CEO Paul Maritz, who served as CEO of VMware from 2008 to 2012, remarked that Pivotal aims to bring to enterprises the technology platforms that have allowed internet giants such as Facebook, Google and Amazon Web Services to operate IT infrastructures efficiently on a massive scale while demonstrating cost and performance efficiencies in application development and data analytics.

Referring specifically to Facebook, Google and Amazon Web Services, Maritz elaborated on the strengths of their IT infrastructure as follows:

If you look at the way they do IT, it is significantly different than the way enterprises do IT. Specifically, they are good at storing large amounts of data and drawing information from it in a cost-effective manner. They can develop applications very quickly. And they are good at automating routines. They used these three capabilities together to introduce new experiences and business processes that have yielded, depending on how you want to count it, a trillion dollars in market value.

According to Maritz, the internet giants are a cut above everyone else with respect to data storage, data analytics, application development and automation. Enterprises, in contrast, rely on comparatively archaic IT infrastructures marked by on-premises data centers and attempts to migrate to the cloud, along with meager data analytics capability and poor or non-existent IT automation and orchestration processes. As a result, the enterprise market represents an opportunity to deploy technology platforms that allow for efficient storage, data integration across disparate data sources and interactive applications with real-time responses to incoming data, as Maritz notes below:

It is clear that there is a widespread need emerging for new solutions that allow customers to drive new business value by cost-effectively reasoning over large datasets, ingesting information that is rapidly arriving from multiple sources, writing applications that allow real-time reactions, and doing all of this in a cloud-independent or portable manner. The need for these solutions can be found across a wide range of industries and it is our belief that these solutions will drive the need for new platforms. Pivotal aims to be a leading provider of such a platform. We are honored to work with GE, as they seek to drive new business value in the age of the Industrial Internet.

More specifically, Pivotal will provide a platform as a service infrastructure called Pivotal One that brings the capabilities currently enjoyed by the likes of Facebook and Google to enterprises in ways that allow them to continue their transition to cloud-based IT infrastructures while concurrently enjoying all of the benefits of advanced storage, analytics and agile application development. In other words, Pivotal One marks the confluence of Big Data, Cloud, Analytics and Application Development in a bold play to commoditize the IT capabilities held by a handful of internet giants and render them available to the enterprise through a PaaS platform.

Pivotal One’s key components include the following:

Pivotal Data Fabric
A platform for data storage and analytics based on Pivotal HD, which features an enterprise-grade distribution of Apache Hadoop in addition to Pivotal HD’s HAWQ analytics platform.

Pivotal Cloud and Application Platform
An application development framework for Java for the enterprise based on Cloud Foundry and Spring.

Pivotal Expert Services
Professional services for agile application development and data analytics.

Open Source Support
Active support of open source projects including, but not limited to, Spring, Cloud Foundry, RabbitMQ™, Redis and OpenChorus™.

Pivotal currently claims Groupon, EMI, and Salesforce.com among its customer base. The company already has 1250 employees and, given GE’s financing and interests, is poised to take a leadership role in the industrial internet space whereby objects such as automobiles, washers, dryers and other appliances deliver real-time data to a circuit of analytic dashboards that iteratively provide feedback, automation and control. Pivotal One also represents a nascent trend within the Platform as a Service industry whereby PaaS is increasingly evolving into an “everything as a service” platform that sits atop various IaaS infrastructures. For example, CumuLogic recently announced news of a platform that allows customers to build Amazon Web Services-like infrastructures marked by suites of IaaS, Big Data, PaaS and application development infrastructures on top of private clouds behind their enterprise firewall. EMC’s Pivotal One is expected to be generally available by the end of 2013.

Three Key Features Of EMC’s Hadoop Distribution, Pivotal HD

This week, EMC launched its own distribution of Hadoop under the branding Pivotal HD. Built on technology that EMC obtained through the acquisition of Greenplum in July 2010, Pivotal HD represents EMC’s next iteration on the Greenplum Unified Analytics Platform (UAP) that it launched in December 2011. The Greenplum UAP featured EMC Greenplum HD, an enterprise-grade distribution of Hadoop, and Greenplum’s database for structured data. The UAP launch also introduced Greenplum Chorus, a platform for collaboration among data scientists in an organization leveraging Big Data. Pivotal HD, however, marks a significant new chapter in EMC’s Hadoop technology, as indicated by its array of features and architectural complexity.

Like many recent Hadoop distributions and technologies, Pivotal HD integrates with SQL so that developers and business analysts who lack familiarity with MapReduce can use it. But the real innovation of Pivotal HD runs deeper than its integration of SQL with Hadoop: it positions Greenplum’s analytic engine alongside HDFS in ways that improve Hadoop query performance beyond what simply appending a SQL interface would achieve. Pivotal HD’s Advanced Database Services (HAWQ) delivers a high-performance SQL engine that permits greater SQL functionality and performance than analogous SQL interfaces such as Hive, Hadapt and Impala. Coupled with Pivotal HD’s virtualization and pluggable storage compatibility features, the platform represents a distinct moment of innovation in the Hadoop space, as evinced by the following three features:

Advanced Database Services (HAWQ)
Pivotal HD’s Advanced Database Services (HAWQ) functionality brings Greenplum’s Massively Parallel Processing (MPP) functionality to Hadoop. As a result, HAWQ allows Pivotal HD users to perform complex joins, MADlib in-database analytics and transactions. Moreover, users can leverage virtually any BI tool on the market to obtain advanced reporting and visualization of data as required. According to EMC benchmarking data, HAWQ-based SQL queries outperform Hive in response time by as much as 100x.

The Advanced Database Service interfaces with other components of Pivotal HD as follows:

[Figure: EMC Pivotal HD]

Given the recent proliferation of SQL-Hadoop interfaces throughout the industry, customers and analysts should expect more data about the comparative efficiencies of SQL-Hadoop interfaces to emerge as more and more SQL-trained analysts start using SQL to operate on data saved in HDFS.
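
To make concrete the kind of join-plus-aggregate query that a SQL-trained analyst might run through a SQL-on-Hadoop interface, here is a minimal sketch in Python. It is not HAWQ code: it uses Python's built-in sqlite3 module purely as a stand-in SQL engine, and the customers and orders tables, their columns and the sample rows are all hypothetical, invented for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "acme"), (2, "globex"), (3, "initech")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 1, 120.0), (2, 2, 75.5), (3, 1, 30.0), (4, 3, 10.0)],
    )
    return conn


def top_customers(conn, limit=3):
    """Join orders to customers and return total spend per customer,
    largest first. This is plain SQL with no MapReduce code involved."""
    query = """
        SELECT c.name, SUM(o.amount) AS total
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY total DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


if __name__ == "__main__":
    for name, total in top_customers(build_demo_db()):
        print(name, total)
```

The point of the sketch is only that the analyst expresses the join and aggregation declaratively. How quickly an engine executes such a query over data in HDFS is exactly what the comparative benchmarks mentioned above would measure.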

Hadoop Virtualization Extensions
Hadoop Virtualization Extensions (HVE) enable the provisioning of Hadoop clusters on VMware virtualized platforms in both public cloud and on-premises environments. HVE gives customers increased flexibility of deployment and enables the construction of high-availability infrastructures for accessing Hadoop data.

Pluggable HDFS Storage
Customers can multiply their data storage options by using standard Hadoop direct attached storage in addition to EMC Isilon OneFS Scale-Out NAS Storage, the latter of which features streamlined loading, backup, replication, snapshotting and elastic scalability functionality.

Analysis
Overall, EMC’s launch into the Hadoop-distribution world represents a significant move to grab Hadoop market share from Cloudera, Hortonworks and MapR. Unlike Intel’s recently launched distribution, EMC’s Pivotal HD claims some proprietary and genuinely innovative Hadoop technology in the form of its Advanced Database Services engine and scale-out storage compatibility. Expect EMC to continue to innovate upon its core technology platform and to follow the lead of companies such as Concurrent in developing tools that make Hadoop more accessible to Java developers as well as SQL users. What remains unclear, at this point, is the extent to which EMC will open-source its technology as it gains market share within the enterprise. For now, however, the Hadoop world has yet another significant player with cash reserves aplenty to continue to innovate on its platform and disrupt the Hadoop landscape in the process.