DataRPM today announced the finalization of $5.1M in Series A funding in a round led by InterWest Partners. DataRPM specializes in a next-generation business intelligence platform that leverages machine learning and artificial intelligence to deliver actionable business intelligence through a natural language-based search engine, allowing customers to dispense with complex, time-consuming data modeling and query production. DataRPM stores customer data within a “distributed computational search index” that enables its platform to apply its natural language query interface to heterogeneous data sources without modeling the data into intricate taxonomic relationships or master data management frameworks. Because this index lets customers run queries against different data sources without constructing data schemas that organize the constituent data fields and their relationships, it promises to accelerate the speed with which customers can derive insights from their data. Not only does the platform deliver a natural language interface, but it also visualizes the results of its Google-like searches.
In an interview with Cloud Computing Today, DataRPM CEO Sundeep Sanghavi noted that the platform’s natural language search functionality is based on proprietary graphing technology analogous to Apache Giraph and Neo4j. The platform operates on data in relational and non-relational formats, although it currently does not support unstructured data. Available in both cloud-based and on-premises deployments, DataRPM promises to disrupt Big Data analytics and contemporary business intelligence platforms by dispensing with the need for complex, time-consuming and expensive data modeling, and by empowering business stakeholders with neither SQL nor scripting skills to analyze data. Today’s funding raise is intended to accelerate the company’s go-to-market strategy and to support product development informed by the platform’s reception among current and future customers.
DataRPM belongs to the rapidly growing space of products that expedite Big Data analytics on Hadoop clusters, as exemplified by the constellation of SQL-like interfaces for querying Hadoop-based data. That said, its natural language query interface represents a genuine innovation in a space dominated by products that render Hadoop accessible to SQL developers and analysts, as opposed to data-savvy stakeholders with Google-like querying expertise. Moreover, DataRPM’s natural language search capabilities push the envelope of “next generation business intelligence” even further than contemporaries such as Jaspersoft, Talend and Pentaho, which thus far have focused largely on the transition within the enterprise from reporting to analytics and data discovery. Expect to hear more about DataRPM as the battle to streamline and simplify the derivation of actionable business intelligence from Big Data takes shape within a vendor landscape marked by the proliferation of analytic interfaces for petabyte-scale relational and non-relational databases.
Hot on the heels of its $12M Series B funding in December, Trifacta recently announced the general availability of the Trifacta Data Transformation Platform. Based on its innovative Predictive Interaction™ technology, the Trifacta Data Transformation Platform uses visualization and machine learning to streamline and enrich user-level interactions with Big Data, such as those experienced by data scientists and business analysts. Trifacta’s Predictive Interaction technology features three components: (1) visualization of big data that empowers analysts to specify trends, values or analytics of interest; (2) interaction whereby the analyst responds to the data visualizations; and (3) prediction of the data transformations suggested by user interactions, with corresponding visualizations of the data transformation. The platform’s machine learning capability iteratively responds to user behavior to generate analytics of increasing value and interest. As a result, users can swiftly proceed from a raw, unprocessed archive of big data to incisive analytics and visualizations without the pre-processing, data cleansing and data transformation steps that are typically necessary to obtain deeper insights into the data in question. The Trifacta Data Transformation Platform enables business analysts without scripting experience to derive nuanced insights about big data and additionally amplifies analyst productivity by means of its unique visualization and machine learning technology platform. Trifacta customers include Lockheed Martin and Accretive Health, both of which remarked on the way in which the Trifacta Data Transformation Platform accelerates the data analysis lifecycle and streamlines user workflows. Trifacta’s technology is unique in the Big Data industry because of its focus on streamlining and enhancing the end-user experience of big data analysis.
Given the ubiquity of data visualization in the industry, much of the platform’s ability to differentiate itself will hinge on the sophistication of its predictive modeling and machine learning capabilities.
Today, Concurrent, Inc. announces the release of Pattern, an open source tool designed to enable developers to build machine-learning applications on Hadoop by leveraging the Predictive Model Markup Language (PMML), the standard export format for popular predictive modeling tools such as R, MicroStrategy and SAS. Data scientists can use Pattern to export applications to Hadoop clusters and thereby run them against massive data sets. Pattern simplifies the process of building predictive models that operate on Hadoop clusters and lowers the barrier to the adoption of Apache Hadoop for advanced data mining and modeling use cases.
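To make the role of PMML more concrete, here is a minimal sketch in Python of what scoring a record against an exported model can look like. It is not Pattern's API and says nothing about how Pattern works internally: the PMML document, the field names (visits, basket, spend), the coefficients and the helper functions load_linear_model and score are all hypothetical, and only a simple linear regression model is handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.1 document: a linear regression with made-up field
# names and coefficients, similar in shape to what a modeling tool exports.
PMML_DOC = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <DataDictionary numberOfFields="3">
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="basket" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="visits"/>
      <MiningField name="basket"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="visits" coefficient="2.5"/>
      <NumericPredictor name="basket" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Read the intercept and per-field coefficients from a PMML string."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model to one record (a dict of field -> value)."""
    intercept, coefficients = model
    return intercept + sum(
        coefficient * record[name] for name, coefficient in coefficients.items()
    )


model = load_linear_model(PMML_DOC)
prediction = score(model, {"visits": 4, "basket": 20})
print(prediction)
```

Because the model travels as a plain XML document, the tool that trains it and the system that scores with it do not need to share a runtime, which is the property that makes PMML attractive as a hand-off format between modeling tools and a Hadoop cluster.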
One use case for Pattern is evaluating the efficacy of models for a “predictive marketing intelligence solution,” as described below by Antony Arokiasamy, Senior Software Architect at AgilOne:
Pattern facilitates AgilOne to deploy a variety of advanced machine-learning algorithms for our cloud-based predictive marketing intelligence solution. As a self-service SaaS offering, Pattern allows us to evaluate multiple models and push the clients’ best models into our high performance scoring system. The PMML interface allows our advanced clients to deploy custom models.
Here, Arokiasamy remarks on how Pattern facilitates the scoring of predictive models, which in turn enables the selection of the best model among several candidates. AgilOne uses Pattern to run multiple predictive models in parallel against large data sets, and its experience additionally illustrates the efficacy of Pattern’s operation on a Hadoop cluster deployed in a cloud-based environment.
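The idea of scoring several candidate models and keeping the best one can be sketched in a few lines of Python. This is an illustration only, not AgilOne's or Pattern's implementation: the candidate models, the holdout records and the choice of mean squared error as the metric are all assumptions, and the models are evaluated one after another on a single machine rather than in parallel on a Hadoop cluster.

```python
def mean_squared_error(predict, records):
    """Average squared difference between predictions and actual values."""
    errors = [(predict(features) - actual) ** 2 for features, actual in records]
    return sum(errors) / len(errors)


def select_best_model(candidates, records):
    """Score every candidate on the same records and return the best name."""
    scores = {
        name: mean_squared_error(predict, records)
        for name, predict in candidates.items()
    }
    best_name = min(scores, key=scores.get)  # lower error is better
    return best_name, scores


# Hypothetical candidate models, each a function from a record to a prediction.
CANDIDATES = {
    "visits_only": lambda r: 10.0 + 5.0 * r["visits"],
    "visits_and_basket": lambda r: 10.0 + 2.5 * r["visits"] + 0.5 * r["basket"],
}

# Hypothetical holdout data: (features, actual outcome) pairs.
HOLDOUT = [
    ({"visits": 4, "basket": 20}, 30.0),
    ({"visits": 2, "basket": 10}, 20.0),
    ({"visits": 6, "basket": 10}, 30.0),
]

best_name, scores = select_best_model(CANDIDATES, HOLDOUT)
print(best_name, scores)
```

In a production setting the winning model would then be pushed to the scoring system, as the quote describes; the sketch stops at the selection step.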
Pattern runs on Cascading, the popular framework for building data processing applications on Hadoop that is used by the likes of Twitter, eBay, Etsy and Razorfish. A free, open source application, Pattern constitutes yet another pillar in Concurrent’s array of applications for streamlining the use of Apache Hadoop, alongside Cascading and Lingual, the ANSI-standard SQL interface that enables developers to query Hadoop clusters without having to learn MapReduce. The release of Pattern consolidates the positioning of Concurrent as a pioneer in the Big Data management space, given its thought leadership in designing applications that facilitate enterprise adoption of Hadoop. Enterprises can now use Concurrent’s Cascading framework to operate on Hadoop clusters using Java APIs, SQL and predictive models written in PMML-compatible analytics applications.