On Thursday, Hortonworks announced that Apache Spark is “YARN Ready” and compatible with the multiple workloads and additional CPU processing demands specific to Spark applications. Because Spark is now compatible with YARN, Hadoop users can use one Hadoop cluster with a single repository of data for a variety of purposes, rather than segmenting workloads so that some data is dedicated to Apache Spark. More specifically, Hadoop users can now rest assured that YARN-based applications work alongside applications that leverage Spark’s capabilities for real-time analytics, interactive analytics, machine learning and stream processing. Hortonworks introduced Apache Spark to the Hortonworks Data Platform as a technology preview download in May. Thursday’s announcement covers the integration of Spark with YARN, with XA Secure (a recent Hortonworks acquisition) for authentication and data security, and with Ambari, all toward the larger goal of delivering an integrated, turnkey, enterprise-grade Hadoop platform. The announcement responds to similar statements from competitors: MapR’s announcement regarding the integration of Spark into its Hadoop distribution, and Cloudera’s announcement of enterprise-grade support for Apache Spark.
The following graphic, which illustrates the integration of Spark into YARN, originated from the Hortonworks blog post “Making Apache Spark YARN Ready.”
On Monday, Rackspace announced the availability of the Hortonworks Data Platform (HDP), powered by Apache Hadoop, within both its managed hosting environment and its public cloud infrastructure. Customers can also choose a hybrid approach, using the managed hosting offering for Hadoop hosted within a private cloud in conjunction with a Hadoop deployment on Rackspace’s public cloud platform. The availability of HDP as part of Rackspace’s suite of offerings is part of a broader move by the San Antonio-based company to offer databases and datastores beyond SQL and Oracle. Rackspace’s recent acquisitions of ObjectRocket and Exceptional Cloud Services, for example, mean that, in addition to Hadoop, it will offer MongoDB and Redis To Go as a service in the near future. The integration of HDP within the Rackspace platform illustrates a convergence within the IT industry: cloud platforms and Big Data platforms are coming together as both technologies become sufficiently mainstream that customers feel comfortable experimenting with cloud hosting environments in combination with the likes of Hadoop and MongoDB. More specifically, cloud adoption appears to be accelerating Big Data adoption, given that customers now have ample opportunities to experiment with cloud-based Hadoop environments without shouldering the burden of deploying and maintaining them.