IBM Throws Its Weight Behind Apache Spark And Poses A Challenge To Databricks

On June 15, IBM announced significant backing for Apache Spark, the open source framework for Hadoop-based analytics. Apache Spark facilitates the development of Hadoop-based applications that specialize in interactive analytics, real-time analytics, machine learning and stream processing. IBM intends to integrate Spark into its analytics and commerce platforms as well as the IBM Watson Health Cloud, and to contribute its IBM SystemML machine learning technology to the Spark ecosystem. Moreover, Big Blue plans to offer Spark as a Service as part of its IBM Bluemix Platform as a Service, and to commit 3,500 developers to Spark-related projects. IBM also announced plans to open a Spark Technology Center in San Francisco to facilitate the development of innovative, data-centric, intelligent applications.
IBM’s support represents a huge coup for Apache Spark and for the startups that rely heavily on its framework to build analytics applications. That said, IBM’s backing also bolsters the broader industry of analytics frameworks built for Hadoop, such as the recently open sourced DataTorrent platform, which offers a production-grade alternative to Apache Spark and Apache Storm.
IBM’s announcement comes in tandem with the general availability of the Databricks cloud platform for Apache Spark, which simplifies the application of Spark to Big Data use cases. Revealed roughly a year ago, the Databricks platform supports the automation of job processes and pipelines that leverage Spark, as well as the use of the popular programming language R on Spark clusters. While IBM Bluemix’s Spark offering may well compete directly with the Databricks cloud, momentum among analytics frameworks has swung decisively in Apache Spark’s direction and promises to continue doing so, assuming IBM can capitalize on its early investment in integrating Spark into its array of platforms and use cases.
IBM’s support of Spark also helps set its cloud platform apart from those of Amazon Web Services and Microsoft as competition in the IaaS space intensifies.