Pepperdata Announces Cluster And Job Optimization Product For Cloud-based Hadoop Clusters On Amazon EMR

Pepperdata today announces a new product that helps Amazon EMR customers optimize the performance of their cloud-based Hadoop jobs. Pepperdata with Amazon EMR delivers enhanced analytics on the performance of jobs running on EMR and optimizes job performance based on instruction and feedback from users. The product gives Amazon EMR users granular visibility into cluster performance, together with analytics on individual jobs that draw on metrics for CPU, memory, and unused capacity, as illustrated in the accompanying graphic.


Because Pepperdata translates its analytics into performance optimization on Amazon EMR, customers benefit from decreased cloud utilization as well as improved job performance. Sean Suchter, CTO of Pepperdata, remarked on the significance of Pepperdata’s product for Amazon EMR as follows:

Amazon EMR is designed to help companies process huge amounts of data easily and cost-effectively without having to commit unnecessary resources. As customers embrace Hadoop in the cloud they need to be able to manage cost and performance without any big surprises. Pepperdata eliminates those blind spots with very granular insight into the performance of current and historical EMR runs.

Here, Suchter comments on the ability of Pepperdata’s EMR product to help customers manage costs for Hadoop-related cloud resources while optimizing performance. Amazon EMR clusters terminate upon the completion of a run, which makes it difficult for users to access performance-related data afterward. Pepperdata’s product for Amazon EMR, by contrast, allows users to analyze the performance of clusters and their constituent jobs even after the cluster has terminated. As a result, teams can analyze historical data to progressively improve cluster performance by determining the optimal amount of computing resources for cloud-based Hadoop jobs.

Today, Pepperdata also announces the availability of Adaptive Scaling for EMR, a product that purchases Amazon EMR instances in accordance with budget and time constraints specified by clients.

All told, today’s announcements from Pepperdata represent a notable addition to the space of products specializing in both infrastructure and application optimization for cloud-based Hadoop workloads. Expect to hear more from Pepperdata as big data adoption expands and companies increasingly turn their attention from deploying Hadoop clusters and their related applications toward optimizing performance at the level of clusters as well as their associated jobs and applications.