RapidMiner Streams Integrates With Apache Storm To Tackle Advanced Analytics On Streaming Big Data

RapidMiner today announces the release of RapidMiner Streams, an application that leverages Apache Storm to enable analytics on real-time streaming data. By integrating RapidMiner’s predictive analytics technology with Apache Storm, RapidMiner Streams allows incoming, streaming data to update RapidMiner’s advanced analytics in real time and to drive iterative modifications to existing algorithms. With the release of RapidMiner Streams, the RapidMiner platform can now perform advanced analytics on streaming data with low latency, high throughput and enhanced computational performance. The integration of Apache Storm adds to RapidMiner’s library of more than 1,500 operators and enhances the platform’s ability to perform real-time predictive analytics. Use cases for RapidMiner’s analytics include analysis of free-text customer feedback delivered online, data mining of millions of image files and analytics on data from wearable devices that recommend an optimal exercise regimen to help users achieve their fitness goals based on historical data. RapidMiner’s analytics are delivered through a graphical user interface that allows analysts to combine operators for the ingestion, analysis and visualization of one or more datasets. Today’s release of RapidMiner Streams, and its integration with Apache Storm, positions the RapidMiner platform to tackle data from the Internet of Things and to embrace use cases featuring massive volumes of real-time streaming data. As the universe of use cases that RapidMiner can handle expands, users should expect corresponding enhancements to the platform as customers contribute feedback specific to their data, products and verticals.


BigPanda Emerges From Stealth To Manage Deluge Of IT Alerts And Notifications

BigPanda today launches from stealth to tackle the explosion of alerts and notifications that IT administrators receive daily from a myriad of applications and devices. The Mountain View-based startup integrates alerts and notifications from disparate sources into a consolidated data feed and parses the unstructured data into structured data to create an aggregated repository of alerts and notifications. BigPanda’s proprietary analytics then run against the integrated repository to derive topologies and relationships, time-based analytics and statistical analytics, as indicated by the screenshot of an incident dashboard below:

Examples of statistical analytics include probabilistic determinations that the concurrent appearance of notifications A, B and C is likely to lead to outcome X, as suggested by historical data about the conjunction of the notifications in question. The platform’s machine-learning technology incrementally refines its analytics as new data arrives and thereby iteratively delivers more nuanced analyses and visualizations of notifications-related data. Overall, the platform enables customers to manage more effectively the tidal wave of notifications that bombards the inboxes of IT administrators, by deriving actionable business intelligence from the aggregation of notifications from discrete systems and applications.
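The kind of probabilistic determination described above can be illustrated with a minimal sketch in Python. It is not BigPanda’s implementation, which is proprietary; the function name, the data layout and the sample history are all hypothetical. It simply estimates, from past incidents, how often an outcome followed the joint appearance of a set of notifications.

```python
def outcome_probability(history, pattern, outcome):
    """Estimate P(outcome | every notification in `pattern` fired together).

    `history` is a hypothetical list of (notifications, observed_outcome)
    pairs, one per past incident window.
    """
    pattern = frozenset(pattern)
    matches = 0  # windows in which the whole pattern appeared
    hits = 0     # of those, windows that ended in `outcome`
    for notifications, observed_outcome in history:
        if pattern <= set(notifications):
            matches += 1
            if observed_outcome == outcome:
                hits += 1
    if matches == 0:
        return None  # the pattern was never observed, so no estimate
    return hits / matches


# Made-up incident history, for illustration only.
history = [
    ({"A", "B", "C"}, "X"),
    ({"A", "B", "C"}, "X"),
    ({"A", "B", "C"}, "ok"),
    ({"A", "B"}, "ok"),
    ({"B", "C"}, "X"),
    ({"A", "B", "C", "D"}, "X"),
]

print(outcome_probability(history, {"A", "B", "C"}, "X"))  # 0.75
```

In this sample, A, B and C appear together in four windows, three of which end in outcome X, hence the estimate of 0.75. A production system would presumably also weigh sample size and recency, which this sketch ignores.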

As told to Cloud Computing Today by BigPanda CEO Assaf Resnick, the platform integrates with monitoring systems such as New Relic, Nagios and Splunk and additionally provides a REST API to connect to different applications, deployment infrastructures and ITSM tools. Moreover, BigPanda today announces the finalization of $7M in Series A funding in a round led by Mayfield with additional participation from Sequoia Capital. The $7M raise brings the total capital raised by BigPanda to $8.5M, following a $1.5M pre-Series A seed round from Sequoia Capital. Deployed as a SaaS application that runs on AWS infrastructure and uses a MongoDB NoSQL datastore, BigPanda fills a critical niche in the IT management space by delivering one of the few applications aimed at consolidated notification management and analytics. As applications, infrastructure components and networking devices proliferate with dizzying complexity in the contemporary datacenter, platforms like BigPanda are likely to become necessary components of IT management as a means of taming the deluge of notifications produced by disparate systems. Meanwhile, BigPanda’s early positioning in the notification-management space renders it a thought leader as well as a technology standout.

PredictionIO Raises $2.5M For Open Source Machine Learning Server For Predictive Analytics

PredictionIO today announces the finalization of $2.5M in funding in a capital raise whose investors include Azure Capital, QuestVP, CrunchFund, Stanford StartX-Fund, Kima Ventures, IronFire, Sood Venture and XG Ventures. The funding will be used to accelerate product development, marketing, sales and operations for the company’s open source machine learning server for predictive analytics. PredictionIO aspires to fill the role in the predictive analytics space that MySQL plays in the relational database space by delivering an open source platform that empowers data scientists both to leverage a pre-defined library of predictive algorithms and to create new algorithms that they can either contribute to the platform or keep to themselves. Built using Scala, the PredictionIO platform supports JVM and Java-based code as well as backend Hadoop-based data. Typical use cases for PredictionIO’s technology include the production of personalized content and recommendation engines, as well as algorithms that predict the behavior of users and industries based on historical trends. Available through the Amazon Web Services marketplace or via download, PredictionIO already boasts an open source user community of over 4,000 developers and undergirds predictive analytics in “hundreds” of applications across a variety of verticals. The platform fills a critical niche in the big data analytics space by delivering an open source, platform-as-a-service-like infrastructure for the development of predictive analytics. Importantly, PredictionIO empowers companies that cannot afford to hire quant-level data scientists to quickly develop and tweak predictive models using its guided, machine learning-based user interface.
That said, much of the success of PredictionIO will depend on the richness and variety of its library of pre-configured predictive modeling algorithms, but its initial round of funding represents a promising start toward accelerating adoption and expanding the platform’s impressive list of existing libraries and relevance for various use cases.

Microsoft Azure Reveals Azure ML, Cloud-Based Platform For Machine Learning And Predictive Analytics

Microsoft recently announced details of Azure ML, a platform for machine learning hosted on the Microsoft Azure cloud. Azure ML enables organizations to rapidly predict future trends in areas such as crime, disease outbreaks, weather and traffic patterns. Whereas machine learning and predictive analytics currently tend to be managed through on-premises installations, Azure ML accelerates the pace at which data teams can obtain insights from historical data by providing a fully managed, scalable platform for machine learning. Customers can thereby focus on developing and refining predictive analytic parameters and algorithms without the burden of provisioning, managing and optimizing the infrastructure on which the applications are hosted. Azure ML will come pre-configured with “visual workflows and startup templates” that accelerate the process of developing predictive analytics. Moreover, the Azure ML platform will allow customers to expeditiously publish web services and APIs to facilitate collaboration between geographically dispersed teams. Currently, MAX451 is using a preview version of Azure ML to determine what retail customers are likely to purchase next, while Carnegie Mellon is using the platform to understand variations in energy output across buildings on its university campus. Azure ML will be released in a public preview mode in July.

Concurrent Releases Pattern To Facilitate Predictive Analytics On Hadoop

Today, Concurrent Inc. announces the release of Pattern, an open source tool designed to enable developers to build machine-learning applications on Hadoop by leveraging the Predictive Model Markup Language (PMML), the standard export format for popular predictive modeling tools such as R, MicroStrategy and SAS. Data scientists can use Pattern to export applications to Hadoop clusters and thereby run them against massive data sets. Pattern simplifies the process of building predictive models that operate on Hadoop clusters and lowers the barrier to the adoption of Apache Hadoop for advanced data mining and modeling use cases.
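To make the role of PMML concrete, the sketch below loads a tiny linear regression model expressed in PMML and scores a single record with it, using only the Python standard library. It illustrates the export format rather than how Pattern itself works: Pattern runs models on Hadoop via Cascading, whereas this example runs locally. The model, its field names (recency_days, orders) and its coefficients are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A hand-written, minimal PMML 4.1 regression model (hypothetical fields).
PMML_DOC = """
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="illustrative linear model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="recency_days" optype="continuous" dataType="double"/>
    <DataField name="orders" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="recency_days"/>
      <MiningField name="orders"/>
      <MiningField name="score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="recency_days" coefficient="-0.5"/>
      <NumericPredictor name="orders" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()

NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Read the intercept and per-field coefficients from the PMML text."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model: intercept + sum(coefficient * field value)."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = load_linear_model(PMML_DOC)
print(score(model, {"recency_days": 4.0, "orders": 3.0}))  # 14.0
```

Because the model is plain XML, the tool that trained it and the system that scores it need not share a language or runtime, which is the portability that Pattern relies on.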

One use case for Pattern is evaluating the efficacy of models for a “predictive marketing intelligence solution,” as described below by Antony Arokiasamy, Senior Software Architect at AgilOne:

“Pattern facilitates AgilOne to deploy a variety of advanced machine-learning algorithms for our cloud-based predictive marketing intelligence solution. As a self-service SaaS offering, Pattern allows us to evaluate multiple models and push the clients’ best models into our high performance scoring system. The PMML interface allows our advanced clients to deploy custom models.”

Here, Arokiasamy remarks on how Pattern facilitates the scoring of predictive models, which in turn enables the selection of the best model among several candidates. AgilOne uses Pattern to run multiple predictive models in parallel against large data sets, and its deployment additionally illustrates the efficacy of Pattern’s operation on a Hadoop cluster deployed in a cloud-based environment.

Pattern runs on Cascading, the popular framework for developing data processing applications on Hadoop that is used by the likes of Twitter, eBay, Etsy and Razorfish. A free, open source application, Pattern constitutes yet another pillar in Concurrent’s array of applications for streamlining the use of Apache Hadoop, alongside Cascading and Lingual, the ANSI-standard SQL interface that enables developers to query Hadoop clusters without having to learn MapReduce. The release of Pattern consolidates Concurrent’s positioning as a pioneer in the Big Data management space, given its thought leadership in designing applications that facilitate enterprise adoption of Hadoop. Enterprises can now use Concurrent’s Cascading framework to operate on Hadoop clusters using Java APIs, SQL and predictive models exported as PMML from compatible analytics applications.