Google Launches Cloud Dataproc, Managed Service For Hadoop and Spark

On Wednesday, Google launched Cloud Dataproc, a managed service for deploying Hadoop, Spark, Hive and Pig designed to simplify the operational overhead associated with managing big data. Google Cloud Dataproc boasts the ability to deploy Spark and Hadoop clusters in seconds and at a fraction of the cost seen in other IaaS providers. Customers pay only for what they use and also have the ability to deploy preemptive instances, or short lived instances designed for batch jobs and fault tolerant use cases. Cloud Dataproc integrates across the lifecycle of Google products such as Cloud Storage and Big Query and allows users to automate cluster management and the resizing of clusters such that customers can focus on data management and analytics, without burdening themselves with the challenges of deploying and managing Hadoop clusters. Priced aggressively at a penny per virtual CPU per cluster per hour, Cloud Dataproc constitutes a notable addition to Google’s burgeoning cloud and big data portfolio, particularly given its easy integration with Google Compute Engine, Google Storage and Cloud Monitoring. Notably, the product offering underscores the meteoric rise of Spark as a key component of Hadoop management and analytics and paves the way for future conversations about Google’s contribution to the evolution of Spark as a possible alternative to MapReduce within the Hadoop ecosystem per Cloudera’s One Platform Initiative.

%d bloggers like this: