Concurrent Inc. today announces the finalization of $10M in Series B funding in a round led by new investor Bain Capital Ventures, with additional participation from existing investors Rembrandt Ventures and True Ventures. Salil Deshpande, Managing Director of Bain Capital Ventures, will join Concurrent Inc.’s board of directors as a result of today’s funding raise. The funding will be used to accelerate the development of Concurrent’s commercial product Driven as well as Cascading, the framework for developing and managing Big Data applications. Driven fills a critical void within the Big Data industry by providing customers with visibility into application performance on Hadoop, while Cascading represents one of the most widely used frameworks for application development on Hadoop. Concurrent’s Series B funding raise comes hot on the heels of its release of details about Cascading 3.0 and the announcement of partnerships with Hadoop vendor Hortonworks and with Databricks. Scheduled for release in early summer, Cascading 3.0 features support for technology platforms and computational frameworks such as local in-memory, Apache MapReduce and Apache Tez. Meanwhile, Cascading’s partnership with Hortonworks integrates the Cascading SDK into the Hortonworks Data Platform under the terms of an agreement whereby Hortonworks will certify, deliver and support the Cascading framework. Today’s funding raise provides further validation of Concurrent’s business model and empowers it to consolidate its early positioning as a leader in the Big Data space, with specializations in applications that streamline and simplify Hadoop application development and cluster management. With its new round of funding in hand, expect Concurrent Inc. to gain more traction around its flagship product Driven as it continues to innovate at the forefront of technology platforms that facilitate the effective operationalization of Big Data.
Today’s Series B announcement brings the total capital raised by Concurrent Inc. to $14M by building upon a March 2013 Series A round of $4M.
Today, Concurrent, Inc. announces the release of Cascading 3.0, the latest version of the popular open source framework for developing and managing Big Data applications. Widely recognized as the de facto framework for the development of Big Data applications on platforms such as Apache Hadoop, Cascading simplifies application development by means of an abstraction framework that facilitates the execution and orchestration of jobs and processes. Compatible with all major Hadoop distributions, Cascading sits squarely at the heart of the Big Data revolution by streamlining the operationalization of Big Data applications in conjunction with Driven, a commercial product from Concurrent that provides visibility regarding application performance within a Hadoop cluster.
Today’s announcement extends Cascading to platforms and computational frameworks such as local in-memory, Apache MapReduce and Apache Tez. Going forward, Concurrent plans for Cascading 3.0 to ship with support for Apache Spark, Apache Storm and other computational frameworks by means of its customizable query planner, which allows customers to extend the operation of Cascading to compatible computational fabrics as illustrated below:
The breakthrough represented by today’s announcement is that it renders Cascading extensible to a variety of computational frameworks and data fabrics and thereby expands the range of use cases and environments in which Cascading can be optimally used. Moreover, the customizable query planner featured in today’s release allows customers to configure their Cascading deployment to operate in conjunction with emerging technologies and data fabrics that can now be integrated into a Cascading deployment by means of the functionality represented in Cascading 3.0.
Used by companies such as Twitter, eBay, Foursquare, Etsy and The Climate Corporation, Cascading boasts over 150,000 applications a month, more than 7,000 deployments and 10% month-over-month growth in downloads. The release of Cascading 3.0 builds on Concurrent’s recent partnership with Hortonworks whereby Cascading will be integrated into the Hortonworks Data Platform and Hortonworks will certify and support the delivery of Cascading in conjunction with its Hadoop distribution. Concurrent, Inc. also recently revealed details of a strategic partnership with Databricks, the principal steward behind the Apache Spark project, that allows it to “operate over Spark…[the] next generation Big Data processing engine that supports batch, interactive and streaming workloads at scale.” In an interview with Cloud Computing Today, Concurrent CEO Gary Nakamura confirmed that Concurrent plans to negotiate partnerships analogous to the agreement with Hortonworks with other Hadoop distribution vendors in order to ensure that Cascading consolidates its positioning as the framework of choice for the development of Big Data applications. Overall, the release of Cascading 3.0 represents a critical product enhancement that positions Cascading to operate over a broader range of computational frameworks and consequently assert its relevance for Big Data application development across a variety of data fabrics. More importantly, however, the product enhancement in Cascading 3.0, in conjunction with the partnership with Databricks regarding Apache Spark, suggests that Cascading is well on its way to becoming the universal framework of choice for developing and managing applications in a Big Data environment, particularly given its compatibility with a wide range of Hadoop distributions and data and computational frameworks.
Concurrent and Hortonworks recently revealed a deepening of their strategic relationship whereby the Cascading SDK will now be integrated into the Hortonworks Data Platform. Moreover, Hortonworks will certify, deliver and support Cascading, the application framework for developing Hadoop-based applications. A Java-based, open source alternative to MapReduce, Cascading provides developers with a framework for constructing complex, repeatable data processing tasks within a Hadoop cluster. Cascading features an abstraction platform which uses plumbing metaphors such as taps, pipes, data flows, cascades and sinks to allow developers to design, visualize and execute jobs and processes on Hadoop-based data without having to master the intricacies of MapReduce. Forthcoming releases of Cascading will support Apache Tez, an initiative that represents the next step after the addition of YARN to Hadoop and that allows Hadoop-based data to “meet demands for fast response times and extreme throughput at petabyte scale.” The partnership between Concurrent, the developer of Cascading, and Hortonworks represents a huge coup for Concurrent given that the collaboration stands to rapidly accelerate Cascading’s adoption in enterprise environments. Hortonworks, meanwhile, benefits from packaging its Hadoop distribution with Cascading, one of the industry’s most respected frameworks for Big Data management and application development, which boasts enterprise users such as Twitter, LinkedIn, eBay and Nokia. The obvious question now is whether Concurrent will finalize similar partnerships with other Hadoop vendors such as Cloudera and MapR or whether Concurrent’s partnership with Hortonworks enables the latter to improve its positioning in the battle for Hadoop market share, particularly in light of Cloudera’s remarkable $900M capital raise and partnership with Intel.
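To make the plumbing metaphor in the Hortonworks announcement above more concrete, here is a toy Java sketch of a source tap feeding a pipe that drains into a sink tap. It is an illustration only and does not use the Cascading API: the names `PlumbingSketch`, `SourceTap`, `SinkTap`, `Pipe`, `runFlow` and `tokenizeAndUppercase` are hypothetical, and everything runs in memory rather than on a Hadoop cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model of the taps/pipes/sinks metaphor. NOT the Cascading API;
// every type and method name here is a hypothetical stand-in.
class PlumbingSketch {

    // A source tap supplies records (here, lines held in memory).
    interface SourceTap { List<String> read(); }

    // A sink tap receives the processed records.
    interface SinkTap { void write(List<String> records); }

    // A pipe is an ordered chain of steps; each step maps one record
    // to zero or more output records.
    static final class Pipe {
        private final List<Function<String, List<String>>> steps = new ArrayList<>();

        Pipe each(Function<String, List<String>> step) {
            steps.add(step);
            return this;
        }

        List<String> apply(List<String> input) {
            List<String> current = input;
            for (Function<String, List<String>> step : steps) {
                List<String> next = new ArrayList<>();
                for (String record : current) {
                    next.addAll(step.apply(record));
                }
                current = next;
            }
            return current;
        }
    }

    // A flow connects a source tap to a sink tap through a pipe.
    static void runFlow(SourceTap source, Pipe pipe, SinkTap sink) {
        sink.write(pipe.apply(source.read()));
    }

    // Example flow: split each line into words, then uppercase each word.
    static List<String> tokenizeAndUppercase(List<String> lines) {
        Pipe pipe = new Pipe()
            .each(line -> Arrays.asList(line.trim().split("\\s+")))
            .each(word -> Arrays.asList(word.toUpperCase()));
        List<String> out = new ArrayList<>();
        runFlow(() -> lines, pipe, out::addAll);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenizeAndUppercase(Arrays.asList("big data", "on hadoop")));
    }
}
```

The point of the sketch is the separation of concerns: the developer describes where data comes from, what happens to each record and where results go, and a planner (omitted here) would decide how to execute that description on MapReduce or another engine.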
Concurrent Inc., the primary sponsor behind Cascading, today announces the release of Driven, an application performance management solution for Big Data applications. Driven enables developers to quickly identify and remediate application failures and performance issues specific to applications built using Hadoop. Available as a plug-in for the Cascading infrastructure, Driven solves a key problem in the Hadoop industry related to the management of Hadoop-based applications. The use of Driven allows developers to confirm the successful execution of application jobs and data processing algorithms, in addition to facilitating the optimization of application performance. Developers can monitor and trend application metrics such as runtime parallelization for both operational and R&D purposes. Moreover, because Driven is part of the Java-based Cascading framework for building analytics and data management applications on Apache Hadoop, Driven users can take advantage of Cascading’s collaboration functionality to communicate with Driven communities all over the world.
Chris Wensel, founder and CTO, Concurrent, Inc., remarked on the significance of Driven as follows:
Driven is a powerful step forward in delivering on the full promise of connecting business with Big Data. Gone are the days when developers must dig through log files for clues to slow performance or failures of their data processing applications. The release of Driven further enables enterprise users to develop data oriented applications on Apache Hadoop in a more collaborative, streamlined fashion. Driven is the key to unlock enterprises’ ability to drive differentiation through data. There’s a lot more to come – this is only the beginning.
Here, Wensel notes the way in which Driven responds to the opacity of Hadoop by providing developers with an alternative to slogging through volumes of log files to understand the performance of their applications. Concurrent CEO Gary Nakamura elaborated on Wensel’s remarks by noting that “One of the big problems in Hadoop today is it’s just a black box,” and that Driven provides a way to quickly navigate to the lines of code that are responsible for application failure. Because of its positioning as part of the Cascading infrastructure, Driven stands to significantly enhance the value of Cascading by providing developers with an extra layer of insight into application performance that complements Cascading’s native framework for Big Data analytics and data management. Expect Driven to raise the status of Cascading within the Big Data industry even further and ultimately confirm its place as the go-to application for Hadoop analytics, data and application management. Driven is currently available in public beta, whereas its commercial variant, Driven Enterprise, will be available in Q2 via an annual subscription.
Today, Concurrent elaborates on the release of Cascading 2.5, the open source framework for facilitating the development of applications on Apache Hadoop. Cascading 2.5 supports the recently released Hadoop 2.0 distribution, including YARN and its other features. Cascading users who are interested in upgrading to Hadoop 2.0 can do so by means of Cascading 2.5. Similarly, applications that leverage the Scalding, Cascalog and PyCascading languages can migrate to Hadoop 2.0 as well by means of the Cascading 2.5 framework. The latest release of Cascading also features “complex join operations and optimizations to dynamically partition and store processed data more efficiently on HDFS,” according to Concurrent’s press release. Finally, the release deepens its compatibility with other Hadoop distributions and Hadoop as a Service vendors such as Cloudera, Hortonworks, MapR, Intel, Altiscale, Qubole and Amazon EMR.
Cascading 2.5 represents one of the few products in either the commercial or open source ecosystem for simplifying the development of Hadoop applications while integrating with a rich and varied ecosystem of products as illustrated below:
The graphic shows how Cascading 2.5 supports all major Hadoop distributions in addition to an impressive list of development languages, database platforms and cloud platforms. In an interview with Cloud Computing Today, Concurrent CEO Gary Nakamura and CTO Chris Wensel noted the uniqueness of Cascading in the Big Data landscape, particularly given its iterative refinement in collaboration with the likes of Twitter, eBay and The Climate Corporation over a period of more than five years.
Today’s announcement regarding the general availability of Cascading 2.5 is accompanied by news of the general availability of Lingual, an ANSI-compliant SQL interface that allows developers to use SQL commands to query data stored in Hadoop clusters. Unlike Apache’s Hive project, Lingual’s ANSI-standard SQL interface enables developers to deploy authentic SQL commands as opposed to Hive’s SQL-like syntax. Cascading Lingual also allows for the migration of legacy SQL workloads onto Hadoop clusters, the export of Hadoop data to BI tools such as Jaspersoft, Pentaho and Talend, and the ability to leverage the power of Cascading in conjunction with SQL to orchestrate the execution of multiple SQL queries as a single application instead of as several discrete, disparate queries. The Big Data space should expect more from Concurrent as it continues to build out tools for simplifying application development on Hadoop, particularly as more and more Hadoop developers come to terms with Cascading’s advantages over MapReduce.
Today, Concurrent Inc. announces the release of Pattern, an open source tool designed to enable developers to build machine-learning applications on Hadoop by leveraging the Predictive Model Markup Language (PMML), the standard export format for popular predictive modeling tools such as R, MicroStrategy and SAS. Data scientists can use Pattern to export applications to Hadoop clusters and thereby run them against massive data sets. Pattern simplifies the process of building predictive models that operate on Hadoop clusters and lowers the barrier to the adoption of Apache Hadoop for advanced data mining and modeling use cases.
An example of a use case for Pattern includes evaluating the efficacy of models for a “predictive marketing intelligence solution” as illustrated below by Antony Arokiasamy, Senior Software Architect at AgilOne:
Pattern facilitates AgilOne to deploy a variety of advanced machine-learning algorithms for our cloud-based predictive marketing intelligence solution. As a self-service SaaS offering, Pattern allows us to evaluate multiple models and push the clients’ best models into our high performance scoring system. The PMML interface allows our advanced clients to deploy custom models.
Here, Arokiasamy remarks on the way in which Pattern facilitates the scoring of predictive models, which in turn enables the selection of the best model among several candidates. AgilOne uses Pattern to run multiple predictive models in parallel against large data sets, and its deployment also illustrates the efficacy of Pattern’s operation on a Hadoop cluster in a cloud-based environment.
Pattern runs on the popular Cascading framework for simplifying the deployment and management of Hadoop clusters that is used by the likes of Twitter, eBay, Etsy and Razorfish. A free, open source application, Pattern constitutes yet another pillar in Concurrent’s array of applications for streamlining the use of Apache Hadoop alongside Cascading and Lingual, the ANSI-standard interface that enables developers to leverage SQL to query Hadoop clusters without having to learn MapReduce. The release of Pattern consolidates the positioning of Concurrent as a pioneer in the Big Data management space given its thought leadership in designing applications that facilitate enterprise adoption of Hadoop. Enterprises can now use Concurrent’s Cascading framework to operate on Hadoop clusters using Java APIs, SQL and predictive models written in PMML-compatible analytics applications.