EMC today announced details of RackHD (Rack Hardware Director), an open source project available under the Apache 2.0 license that delivers hyper-scale, multi-vendor hardware and network management and orchestration. By automating the management of fleets of servers, RackHD responds to the industry’s need to simplify the operation of heterogeneous assemblages of servers in the modern data center. RackHD can be used in conjunction with third-party orchestration platforms to manage and automate multi-vendor hardware deployments more effectively. With the release of RackHD, EMC intends to elicit support from manufacturers of servers and network devices in an effort to build an open source standard for hyper-scale management of multi-vendor deployments of servers and networking devices. As such, the project underscores EMC’s commitment to open source technologies while concomitantly illustrating the acuity of EMC’s strategic vision in understanding the industry’s need for multi-vendor infrastructure management tools that can operate at the scale of 10,000 servers. In related news, the CoprHD community announced the release of CoprHD 2.4, a storage centralization and management platform for multi-vendor storage. CoprHD 2.4 adds support for EMC Elastic Cloud Storage (ECS) and EMC XtremIO 4.0. CoprHD addresses the problem of multi-vendor management for the storage industry in ways that complement EMC’s vision, embodied in RackHD, for hyper-scale, multi-vendor management and automation of servers and networking devices. All told, today’s announcements represent a landmark in hyper-scale infrastructure management, given that EMC has now thrown its weight behind an open source project that intends to simplify hyper-scale, multi-vendor hardware and networking management.
As infrastructure deployments increase in size and complexity, expect offerings such as RackHD to garner increased support from the vendor community, particularly as the hybridity and heterogeneity of the contemporary data center intensify the need for hyper-scale, multi-vendor management and orchestration.
Puppet Labs recently announced a collaboration with EMC Corporation that renders DevOps technology from Puppet Labs more readily accessible to EMC customers. As a result of the partnership, Puppet Enterprise will be available as a component of EMC’s Federation Enterprise Hybrid Cloud, which delivers enterprise-grade hybrid cloud solutions that leverage public clouds from vendors such as EMC Cloud Service Providers, vCloud Air and Amazon Web Services. Puppet Enterprise provides a framework for managing infrastructure as code, thereby increasing the operational agility of development and operations teams by making it easier to apply large numbers of changes to infrastructure and application deployments. Federation Enterprise Hybrid Cloud customers can now rely on Puppet Enterprise to bring enhanced IT automation and consistent change management to their deployments. While the product integration between Puppet Enterprise and the Federation Enterprise Hybrid Cloud constitutes the most critical component of this announcement, EMC and Puppet Labs have also agreed to develop a DevOps readiness program to help customers accelerate their adoption of DevOps practices as well as their use of hybrid clouds. EMC customers can access Puppet Enterprise by means of the company’s service catalogue, the EMC Select Global Price List, and thereby integrate Puppet Enterprise with any assemblage of EMC hardware and software. The collaboration represents a huge coup for Puppet Labs, opening up Puppet Enterprise to EMC’s channel of customers, while EMC gains the feather in its cap of offering Puppet Enterprise along with the standardization of IT automation it brings to Federation Enterprise Hybrid Cloud deployments.
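To make the idea of managing infrastructure as code more concrete, here is a minimal sketch in Python of the declarative pattern that tools in this category follow: the desired state is written down as data, compared with the current state, and only the differences are acted upon. This is an illustration under stated assumptions, not Puppet's language or API. The `PackageResource` class, the `plan_changes` and `apply_changes` helpers, and the package names are all hypothetical, and the "system" is simply an in-memory set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageResource:
    """Hypothetical declaration: a package that should be present or absent."""
    name: str
    ensure: str  # "present" or "absent"


def plan_changes(desired, installed):
    """Return the actions needed to move `installed` toward `desired`."""
    actions = []
    for res in desired:
        is_installed = res.name in installed
        if res.ensure == "present" and not is_installed:
            actions.append(("install", res.name))
        elif res.ensure == "absent" and is_installed:
            actions.append(("remove", res.name))
    return actions


def apply_changes(actions, installed):
    """Apply planned actions to a copy of the in-memory state and return it."""
    state = set(installed)
    for verb, name in actions:
        if verb == "install":
            state.add(name)
        else:
            state.discard(name)
    return state


# Desired state, expressed as data rather than as a sequence of commands.
desired = [
    PackageResource("nginx", "present"),
    PackageResource("telnet", "absent"),
]
installed = {"telnet", "openssl"}  # stand-in for the current system state

first_plan = plan_changes(desired, installed)
converged = apply_changes(first_plan, installed)
second_plan = plan_changes(desired, converged)  # nothing left to do once converged
```

Because the plan is computed from the gap between desired and current state, running it a second time yields no actions, which is the property that lets the same declaration be applied repeatedly across many machines.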
On Wednesday, Pivotal announced the acquisition of Xtreme Labs, a Toronto-based mobile development and consulting firm. The acquisition complements Pivotal’s cloud and big data platforms by expanding Pivotal’s mobile capabilities and extending the reach of its emerging, behemoth technology platform even further. Pivotal, recall, offers a platform as a service based on the Cloud Foundry project that additionally boasts big data capabilities derived from the acquisition of Greenplum by its parent company EMC. The acquisition of Xtreme Labs “aligns with Pivotal’s strategy to capitalize on the nexus of converging forces in the industry” and illustrates the seriousness of its intent to build a technology platform called Pivotal One that brings the computing power enjoyed by Amazon Web Services, Facebook and Google to the enterprise. Specifically, the acquisition of Xtreme Labs positions Pivotal to build a technology platform marked by the convergence of cloud, big data, mobile and social media applications.
In an April webcast announcing the launch of Pivotal One, Pivotal CEO Paul Maritz remarked on the divide between the IT infrastructures of select internet giants and traditional enterprise IT. Maritz noted that Amazon Web Services, Facebook and Google excel at storing massive amounts of data, extracting actionable business intelligence from that data, rapidly developing software applications and automating routine procedures. Pivotal One intends to deliver a platform as a service that extends the data storage, data analytics and agile application development capabilities currently held by a handful of internet giants to enterprise IT more generally. Recently, Pivotal has made news through strategic partnerships with Piston Cloud, to refine the integration of OpenStack with Cloud Foundry, and with IBM, to develop the governance for Cloud Foundry. Terms of the acquisition of Xtreme Labs were not disclosed, although AllThingsD reports that Pivotal paid $65 million in cash.
EMC and Juniper recently revealed details of updates to their Software Defined Networking (SDN) platforms and strategies.
Juniper launched a suite of products branded JunosV Contrail featuring the following components:
• The JunosV controller decouples management of the network from the hardware that undergirds the network, enabling vendors to quickly deploy network services and more effectively manage the overall network infrastructure.
• JunosV Contrail virtualizes the entire network, thereby enabling vendors to leverage a more flexible network topology in conjunction with increased network scalability.
• The platform supports both OpenStack and CloudStack.
Meanwhile, EMC revealed details of the ViPR Software-Defined Storage Platform as follows:
• The EMC ViPR Software-Defined Storage Platform allows customers to manage both a software-defined networking infrastructure and data stored within that infrastructure.
• The EMC ViPR Software-Defined Storage Platform integrates with OpenStack via Swift.
• The platform integrates with VMware’s software-defined data center environment and provides APIs that interoperate with OpenStack and Microsoft.
• The EMC ViPR Controller allows customers to use their current storage platforms for existing data, while enabling the provisioning of ViPR Object Data Services for new storage platforms that have the option of leveraging Amazon S3 or HDFS APIs.
Compatibility with OpenStack marks the key point of comparison between the two SDN platforms. Other key players in the SDN space include VMware, due to its acquisition of Nicira, as well as Cisco, Midokura, Nexenta Systems and Big Switch Networks. Customers should expect the SDN space to continue to deliver wave upon wave of functionality enhancements as SDN technology matures and becomes increasingly compatible both with a range of cloud platforms from myriad vendors and with IT automation software and DevOps platforms.
This week, EMC and its subsidiary VMware revealed details of the vision behind Pivotal, their spin-off company financed in part by $105 million in capital from GE. In a webcast announcing the launch of Pivotal on Wednesday, Pivotal CEO Paul Maritz, CEO of VMware from 2008 to 2012, remarked that Pivotal attempts to bring to enterprises the technology platforms that have allowed internet giants such as Facebook, Google and Amazon Web Services to efficiently operate IT infrastructures on a massive scale while concurrently demonstrating cost and performance efficiencies in application development and data analytics.
Referring specifically to Facebook, Google and Amazon Web Services, Maritz elaborated on the strengths of their IT infrastructure as follows:
If you look at the way they do IT, it is significantly different than the way enterprises do IT. Specifically, they are good at storing large amounts of data and drawing information from it in a cost-effective manner. They can develop applications very quickly. And they are good at automating routines. They used these three capabilities together to introduce new experiences and business processes that have yielded — depended on how you want to count it — a trillion dollars in market value.
According to Maritz, the internet giants are a cut above everyone else with respect to data storage, data analytics, application development and automation. Enterprises, in contrast, rely on comparatively archaic IT infrastructures marked by on-premises data centers and tentative attempts to migrate to the cloud, in conjunction with meager data analytics capability and poor or non-existent IT automation and orchestration processes. As a result, the enterprise market represents an opportunity to deploy technology platforms that allow for efficient storage, data integration across disparate data sources and interactive applications with real-time responses to incoming data, as Maritz notes below:
It is clear that there is a widespread need emerging for new solutions that allow customers to drive new business value by cost-effectively reasoning over large datasets, ingesting information that is rapidly arriving from multiple sources, writing applications that allow real-time reactions, and doing all of this in a cloud-independent or portable manner. The need for these solutions can be found across a wide range of industries and it is our belief that these solutions will drive the need for new platforms. Pivotal aims to be a leading provider of such a platform. We are honored to work with GE, as they seek to drive new business value in the age of the Industrial Internet.
More specifically, Pivotal will provide a platform as a service infrastructure called Pivotal One that brings the capabilities currently enjoyed by the likes of Facebook and Google to enterprises in ways that allow them to continue their transition to cloud-based IT infrastructures while concurrently enjoying all of the benefits of advanced storage, analytics and agile application development. In other words, Pivotal One marks the confluence of Big Data, Cloud, Analytics and Application Development in a bold play to commoditize the IT capabilities held by a handful of internet giants and render them available to the enterprise through a PaaS platform.
Pivotal One’s key components include the following:
Pivotal Data Fabric
A platform for data storage and analytics based on Pivotal HD, which features an enterprise-grade distribution of Apache Hadoop in addition to Pivotal HD’s HAWQ analytics platform.
Pivotal Cloud and Application Platform
An application development framework for Java for the enterprise based on Cloud Foundry and Spring.
Pivotal Expert Services
Professional services for agile application development and data analytics.
Open Source Support
Active support of open source projects such as, but not limited to, Spring, Cloud Foundry, RabbitMQ™, Redis and OpenChorus™.
Pivotal currently claims Groupon, EMI and Salesforce.com among its customer base. The company already has 1,250 employees and, given GE’s financing and interests, is poised to take a leadership role in the industrial internet space, whereby objects such as automobiles, washers, dryers and other appliances deliver real-time data to a circuit of analytic dashboards that iteratively provide feedback, automation and control. Pivotal One also represents a nascent trend within the Platform as a Service industry whereby PaaS is increasingly evolving into an “everything as a service” platform that sits atop various IaaS infrastructures. For example, CumuLogic recently announced a platform that allows customers to build Amazon Web Services-like infrastructures marked by suites of IaaS, Big Data, PaaS and application development infrastructures on top of private clouds behind their enterprise firewall. EMC’s Pivotal One is expected to be generally available by the end of 2013.
This week, EMC launched its own distribution of Hadoop under the branding Pivotal HD. Built on technology that EMC obtained through the acquisition of Greenplum in July 2010, Pivotal HD represents EMC’s next iteration on the Greenplum Unified Analytics Platform (UAP) that it launched in December 2011. The Greenplum UAP featured EMC Greenplum HD, an enterprise-grade distribution of Hadoop, and Greenplum’s database for structured data. Greenplum UAP also introduced Greenplum Chorus, an innovative platform for collaboration amongst data scientists in an organization leveraging Big Data. Pivotal HD, however, marks a significant new chapter in EMC’s Hadoop technology as indicated by its array of features and architectural complexity.
Like many recent Hadoop distributions and technologies, Pivotal HD integrates with SQL to broaden its use among developers and business analysts who lack familiarity with MapReduce. But the real innovation of Pivotal HD runs deeper than its integration of SQL with Hadoop and concerns the positioning of Greenplum’s analytic engine alongside HDFS in ways that enable performance enhancements to Hadoop querying over and beyond the simple addition of a SQL interface. Pivotal HD’s Advanced Database Services (HAWQ) delivers a high-performance SQL engine that permits greater SQL functionality and performance than analogous SQL interfaces such as Hive, Hadapt and Impala. Coupled with Pivotal HD’s virtualization and pluggable storage compatibility features, the platform represents a distinct moment of innovation in the Hadoop space as evinced by the following three features:
Advanced Database Services (HAWQ)
Pivotal HD’s Advanced Database Services (HAWQ) functionality brings Greenplum’s Massively Parallel Processing (MPP) functionality to Hadoop. As a result, HAWQ allows Pivotal HD users to perform complex joins, in-database analytics with MADlib, and transactions. Moreover, users have the luxury of leveraging virtually any BI tool on the marketplace to obtain advanced reporting and visualization of data as required. HAWQ-based SQL queries outperform Hive in terms of response time by as much as 100x, according to EMC benchmarking data.
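The appeal of a SQL interface for analysts who lack familiarity with MapReduce can be illustrated with a small sketch. The Python example below computes the same per-customer order total twice: once with a hand-written, reduce-side style join of the kind a MapReduce programmer would write, and once with a single SQL statement. It uses SQLite from the Python standard library purely as a stand-in SQL engine. It is not HAWQ and says nothing about HAWQ's performance, and the tables, column names and data are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (id, name) and (order_id, customer_id, amount).
customers = [(1, "Acme"), (2, "Globex")]
orders = [(101, 1, 250.0), (102, 1, 100.0), (103, 2, 75.0)]


def join_by_hand(customers, orders):
    """Reduce-side style join: group both inputs by key, then combine per key."""
    grouped = defaultdict(lambda: {"names": [], "amounts": []})
    for cust_id, name in customers:             # "map" step for customers
        grouped[cust_id]["names"].append(name)
    for _order_id, cust_id, amount in orders:   # "map" step for orders
        grouped[cust_id]["amounts"].append(amount)
    totals = {}
    for group in grouped.values():              # "reduce" step, once per key
        for name in group["names"]:
            totals[name] = sum(group["amounts"])
    return totals


def join_with_sql(customers, orders):
    """The same aggregation expressed declaratively as a SQL join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        "SELECT c.name, SUM(o.amount) "
        "FROM customers c JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.name"
    ).fetchall()
    conn.close()
    return dict(rows)


hand_totals = join_by_hand(customers, orders)
sql_totals = join_with_sql(customers, orders)
```

In the SQL version the engine decides how to execute the join, which is the layer where an MPP engine can apply its own optimizations without any change to the query text.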
[Figure: how the Advanced Database Service interfaces with other components of Pivotal HD.]
Given the recent proliferation of SQL-Hadoop interfaces throughout the industry, customers and analysts should expect more data about the comparative efficiencies of SQL-Hadoop interfaces to emerge as more and more SQL-trained analysts start using SQL to operate on data saved in HDFS.
Hadoop Virtualization Extensions
Hadoop Virtualization Extensions enable the provisioning of Hadoop clusters on VMware virtualized platforms in both public cloud and on-premise environments. HVE provides customers increased flexibility of deployment and enables the construction of high availability infrastructures for the access of Hadoop data.
Pluggable HDFS Storage
Customers can multiply their data storage options by using standard Hadoop direct attached storage in addition to EMC Isilon OneFS Scale-Out NAS Storage, the latter of which features streamlined loading, backup, replication, snapshotting and elastic scalability functionality.
Overall, EMC’s launch into the Hadoop-distribution world represents a stunning and significant move to grab Hadoop market share from Cloudera, Hortonworks and MapR. Unlike Intel’s recently launched distribution, EMC’s Pivotal HD claims some proprietary and genuinely innovative Hadoop technology in the form of its Advanced Database Services engine and scale-out storage compatibility. Expect EMC to continue to innovate upon its core technology platform and to follow the lead of the likes of Concurrent in developing tools that render Hadoop more accessible to Java-based developers as well as to SQL users. What remains unclear, at this point, is the extent to which EMC will open-source its technology as it gains market share within the enterprise. For now, however, the Hadoop world has yet another significant player with cash reserves aplenty to continue to innovate on its platform and disrupt the Hadoop landscape in the process.
On Tuesday, Oracle announced the availability of the Big Data appliance that it introduced to the world at its October conference, Oracle Open World. The appliance runs on Linux and features Cloudera’s version of Apache Hadoop (CDH), Cloudera Manager for managing the Hadoop distribution, the Oracle NoSQL database and an open source version of R, the statistical software package. Oracle’s partnership with Cloudera in delivering its Big Data appliance goes beyond the latter’s selection as a Hadoop distributor to include assistance with customer support. Oracle plans to deliver tier one customer support while Cloudera will provide assistance with tier two and tier three customer inquiries, including those beyond the domain of Hadoop.
Oracle will run its Big Data appliance on hardware featuring 864 GB of main memory, 216 CPU cores, 648 TB of raw disk storage, 40 Gb/s InfiniBand connectivity and 10 Gb/s Ethernet data center connectivity. Oracle also revealed details of four connectors to its appliance with the following functionality:
• Oracle Loader for Hadoop, which loads massive amounts of data into the appliance by using the MapReduce parallel processing technology.
• Oracle Data Integrator Application Adapter for Hadoop, which provides a graphical interface that simplifies the creation of Hadoop MapReduce programs.
• Oracle Connector R, which provides users of R streamlined access to the Hadoop Distributed File System (HDFS).
• Oracle Direct Connector for Hadoop Distributed File System (ODCH), which supports the integration of Oracle’s SQL database with the Hadoop Distributed File System.
Oracle’s announcement of the availability of its Big Data appliance comes as the battle for Big Data market share takes shape in a landscape dominated by the likes of Teradata, Microsoft, IBM, HP, EMC, Informatica, MarkLogic and Karmasphere. Oracle’s selection of Cloudera as its Hadoop distributor indicates that it intends to make a serious move into the world of Big Data. First, the partnership with Cloudera gives Oracle increased access to Cloudera’s universe of customers. Second, the partnership enhances the credibility of Oracle’s Big Data offering given that Cloudera represents the most prominent distributor of Apache Hadoop in the U.S.
In October, Microsoft revealed plans for a Big Data appliance featuring Hadoop for Windows Server and Azure, and Hadoop connectors for SQL Server and SQL Parallel Data Warehouse. Whereas Oracle chose Cloudera for Hadoop distribution, Microsoft partnered with Yahoo spinoff Hortonworks to integrate Hadoop with Windows Server and Windows Azure. In late November, HP provided details of Autonomy IDOL (Integrated Data Operating Layer) 10, which features the ability to process large-scale structured data sets in addition to a NoSQL interface for loading and analyzing structured and unstructured data. In December, EMC released its Greenplum Unified Analytics Platform (UAP), marked by the ability to load structured data, enterprise-grade Hadoop for analyzing structured and unstructured data, and Chorus, a collaboration and productivity software tool. Bolstered by its partnership with Cloudera, Oracle is set to compete squarely with HP’s Autonomy IDOL 10, EMC’s Greenplum Chorus and IBM’s BigInsights until Microsoft’s appliance officially enters the Big Data dohyō (土俵), or sumo ring, as well.