Cloudera recently announced a One Platform Initiative that aspires to make Apache Spark the default framework for processing analytics in Hadoop, ahead of MapReduce. Cloudera’s One Platform Initiative will focus on bolstering the security of Apache Spark, rendering Spark more scalable, enhancing management functionality and augmenting Spark Streaming, the Spark component that focuses on ingesting massive volumes of streaming data for use cases such as the internet of things. Cloudera’s efforts to improve the security of Apache Spark will focus on ensuring the encryption of data at rest as well as over the wire. Meanwhile, the initiative to improve the scalability of Apache Spark aims to render it scalable to as many as 10,000 nodes including enhanced ability to handle computational workloads by means of an integration with Intel’s Math Kernel Library. With respect to management, Cloudera plans to deepen Spark’s integration with YARN by creating metrics that provide insight into resource utilization as well as improvements to multi-tenant performance. Regarding Spark Streaming, Cloudera plans to render Spark Streaming more broadly available to business users via the addition of SQL semantics and the ability to support 80% of common streaming workloads.
Cloudera’s larger goal is to enhance the enterprise-readiness of Apache Spark with a view to promoting it as a viable alternative to MapReduce. All of Cloudera’s enhancements to Spark will be contributed to the Apache Spark open source project. That said, Cloudera’s leadership in stewarding the acceleration of the enterprise-readiness of Apache Spark as a MapReduce alternative promises to position it strongly as the undisputed market share and thought leader in the Hadoop distribution space, particularly given the range of its intended contributions to Spark and the depth of its vision for subsequent Spark enhancements in forthcoming months.