Today, Concurrent Inc. announces the release of Pattern, an open source tool designed to enable developers to build machine-learning applications on Hadoop by leveraging the Predictive Model Markup Language (PMML), the standard export format for popular predictive modeling tools such as R, MicroStrategy and SAS. Data scientists can use Pattern to export applications to Hadoop clusters and thereby run them against massive data sets. Pattern simplifies the process of building predictive models that operate on Hadoop clusters and lowers the barrier to the adoption of Apache Hadoop for advanced data mining and modeling use cases.
One use case for Pattern is evaluating the efficacy of models for a “predictive marketing intelligence solution,” as described below by Antony Arokiasamy, Senior Software Architect at AgilOne:
“Pattern facilitates AgilOne to deploy a variety of advanced machine-learning algorithms for our cloud-based predictive marketing intelligence solution. As a self-service SaaS offering, Pattern allows us to evaluate multiple models and push the clients’ best models into our high performance scoring system. The PMML interface allows our advanced clients to deploy custom models.”
Here, Arokiasamy describes how Pattern facilitates the scoring of predictive models, which makes it possible to select the best model from several candidates. AgilOne’s use of Pattern to run multiple predictive models in parallel against large data sets also illustrates how effectively Pattern operates on a Hadoop cluster deployed in a cloud-based environment.
Pattern runs on Cascading, the popular framework for simplifying the deployment and management of Hadoop clusters, which is used by the likes of Twitter, eBay, Etsy and Razorfish. A free, open source application, Pattern joins Cascading and Lingual in Concurrent’s array of applications for streamlining the use of Apache Hadoop; Lingual is the ANSI-standard SQL interface that enables developers to query Hadoop clusters without having to learn MapReduce. The release of Pattern consolidates Concurrent’s position as a pioneer in the Big Data management space, given its thought leadership in designing applications that facilitate enterprise adoption of Hadoop. Enterprises can now use Concurrent’s Cascading framework to operate on Hadoop clusters using Java APIs, SQL and predictive models written in PMML-compatible analytics applications.