On Thursday, HP announced an agreement to invest $50M in Hortonworks. HP’s investment builds on the $100M Hortonworks raised in March in a financing round led by funds managed by BlackRock and Passport Capital as well as existing investors. The investment illustrates HP’s commitment to its reseller relationship with Hortonworks, which allows it to resell the Hortonworks Data Platform. Moreover, HP plans to continue refining the engineering of its products so that they integrate with YARN, the resource management component of version 2.x of Hadoop. In addition to preparing its products to operate in conjunction with YARN, HP will be tuning its product architecture to perform optimally with the Hortonworks Data Platform more generally. Key HP products targeted for integration with the Hortonworks Data Platform include the HP HAVEn platform, one component of which is HP Vertica. As a result of the $50M equity investment, HP’s Executive Vice President and Chief Technology Officer Martin Fink will join the board of directors of Hortonworks. HP’s investment in Hortonworks underscores how the Big Data revolution is poised to accelerate as technology companies deepen their relationships with Hadoop vendors in anticipation of delivering turnkey big data analytics solutions that simplify and streamline the operationalization of Big Data.
Databricks, the company founded by the team that developed Apache Spark, recently announced the finalization of $33M in Series B funding in a round led by New Enterprise Associates with participation from existing investor Andreessen Horowitz. The company also revealed plans for commercializing Apache Spark by means of the newly launched Databricks Cloud, which simplifies the data pipeline for storing data, running ETL processing, and then running analytics and data visualizations on cloud-based Big Data. Powered by Apache Spark, the Databricks Cloud leverages Spark’s array of capabilities for operating on Big Data, such as its ability to operate on streaming data, perform graph processing and offer SQL on Hadoop, as well as its machine learning functionality. The platform aims to deliver a streamlined data pipeline for ingesting, analyzing and visualizing Hadoop-based data in a way that dispels the need to utilize a combination of heterogeneous technologies. Databricks will initially offer the Databricks Cloud on Amazon Web Services but plans to expand its availability to other clouds in subsequent months.
On Monday, MapR Technologies announced the finalization of $110M in funding consisting of $80M in equity financing and $30M in debt financing. Google Capital led the equity funding in collaboration with Qualcomm Incorporated, Lightspeed Venture Partners, Mayfield Fund, NEA and Redpoint Ventures, while MapR’s debt funding was financed by Silicon Valley Bank. The funding will be used to spearhead MapR’s explosive growth in the Hadoop distribution and analytics space, as illustrated by a threefold increase in bookings in Q1 of 2014 as compared to 2013. Gene Frantz, General Partner at Google Capital, commented on Google Capital’s participation in the June 30 funding round as follows:
MapR helps companies around the world deploy Hadoop rapidly and reliably, generating significant business results. We led this round of funding because we believe MapR has a great solution for enterprise customers, and they’ve built a strong and growing business.
Monday’s announcement comes soon after MapR’s news of its support for Apache Hadoop 2.x and YARN in addition to all five components of Apache Spark, the open source technology used for big data applications that specialize in interactive analytics, real-time analytics, machine learning and stream processing. The additional $110M in funding strongly positions MapR with respect to competitors Cloudera and Hortonworks, given that Cloudera recently raised $900M and Hortonworks finalized $100M in funding. The news of MapR’s $110M funding also coincides with a recent statement from Hortonworks certifying the compatibility of YARN with Apache Spark as part of a larger announcement about the integration of Spark into the Hortonworks Data Platform (HDP) alongside its Hadoop security acquisition XA Secure and Apache Ambari for the provisioning and management of Hadoop clusters. With a fresh round of capital in the bank and backing from Google, the creator of MapReduce, MapR signals that the battle for Hadoop market share is a three-horse race that is almost certain to intensify as vendors compete to streamline and simplify the operationalization of Big Data. In the meantime, Big Data-related venture capital continues to flow like water bursting out of a fire hydrant as the Big Data space tackles problems related to big data analytics, streaming big data and Hadoop security.
Actian Announces “Right To Deploy” Pricing Model Marked By Freedom From Vendor Lock-In For Big Data Analytics
Big data analytics vendor Actian today announced the availability of customer-friendly pricing options that make it easier for customers to take advantage of its analytics platform for Apache Hadoop. Actian’s latest pricing options feature “capacity-based and subscription models” in addition to a Right to Deploy option that offers greater flexibility in how the Actian Analytics Platform is deployed. The Actian Analytics Platform delivers actionable business intelligence and advanced data visualization for Hadoop-based data, taking advantage of the platform’s proprietary predictive analytics algorithms and low latency. Moreover, the Actian Analytics Platform’s Hadoop SQL Edition provides a SQL-compliant Hadoop analytics platform that allows users to perform data discovery, data profiling and analytics via SQL rather than MapReduce. As of today’s announcement, Actian’s Right to Deploy option allows customers unlimited usage of the platform for a period of one, two or three years in addition to the right to use whatever has been deployed, forever. The Right to Deploy choice represents a particularly attractive option for customers that anticipate significant expansions in their business that dictate the need for enhanced infrastructure and application scalability. Moreover, the Right to Deploy option gives customers freedom from vendor lock-in by empowering them to keep using their deployments whether they continue to partner with Actian or choose another vendor for their Hadoop analytics needs. Actian’s simplified platform pricing offers some of the greatest flexibility regarding Big Data analytics in the industry, in a red-hot space marked by an increasing number of vendors large and small.
That said, few vendors have streamlined and simplified the process of operationalizing Big Data analytics in a way that lays out programmatic approaches to obtaining meaningful analytics on Hadoop that vary with the specific use case in mind. Expect increasing competition in the Hadoop analytics space to drive more and more vendors to differentiate themselves from the pack, although the main task for the industry at large consists of delivering a turnkey solution for big data analytics featuring machine learning-based, best-practice recommendations for extracting meaningful analytics from massive, ever-increasing amounts of data.
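The contrast drawn above between querying Hadoop data via SQL and writing MapReduce jobs can be illustrated with a small, self-contained sketch. It is not Actian's product or API: Python's built-in `sqlite3` module stands in for a SQL-on-Hadoop engine, the map, shuffle and reduce steps are written by hand rather than run on a cluster, and the `sales` records and function names are hypothetical. Both functions compute the same per-region totals.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample records: (region, sale_amount).
sales = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 8)]


def mapreduce_totals(records):
    """Total sales per region, written in the map/shuffle/reduce style."""
    # Map: emit one (key, value) pair per input record.
    mapped = [(region, amount) for region, amount in records]
    # Shuffle: group all values that share a key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Reduce: collapse each group of values into a single total.
    return {key: sum(values) for key, values in groups.items()}


def sql_totals(records):
    """The same aggregation expressed as one declarative SQL query.

    sqlite3 is only a stand-in for a SQL-on-Hadoop engine here.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    conn.close()
    return dict(rows)


mr_result = mapreduce_totals(sales)
sql_result = sql_totals(sales)
```

The SQL version states what result is wanted and leaves the grouping strategy to the engine, which is the convenience that SQL-on-Hadoop products aim to offer analysts who would otherwise write the map and reduce steps themselves.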
On Thursday, Hortonworks announced that Apache Spark is “YARN Ready” and compatible with the multiple workloads and additional CPU processing demands specific to Spark applications. As a result of the compatibility of Apache Spark with YARN, Hadoop users can now use one Hadoop cluster with a single repository of data for a variety of purposes rather than having to segment workloads such that some data is dedicated to Apache Spark. More specifically, Hadoop users can now rest assured that YARN-based applications work collaboratively with applications that leverage Spark’s capabilities to facilitate real-time analytics, interactive analytics, machine learning and stream processing. Hortonworks introduced Apache Spark to the Hortonworks Data Platform as a technology preview download in May and has now announced the integration of Spark with YARN, with its recent acquisition XA Secure for authentication and data security purposes, and with Ambari, toward the larger goal of delivering an integrated, turnkey, enterprise-grade Hadoop platform. Thursday’s announcement by Hortonworks responds to similar statements by competitor MapR regarding the integration of Spark into its Hadoop distribution, and to Cloudera’s announcement of its enterprise-grade support for Apache Spark.
The following graphic illustrating the integration of Spark into YARN originated from the Hortonworks blog post Making Apache Spark YARN Ready.
Cloudera recently acquired big data security vendor Gazzang to strengthen its security offerings for its Hadoop distribution and related offerings. Cloudera’s acquisition of Gazzang will provide “enterprise-grade data encryption and key management.” In addition, the Gazzang team will constitute the foundation of the Cloudera Center for Security Excellence, dedicated to the development of comprehensive Hadoop security solutions. Cloudera’s acquisition of Gazzang comes weeks after the announcement of the acquisition of XA Secure by Hortonworks to obtain access to a comprehensive security solution for Hadoop that addresses issues such as user authentication, authorization, and audit and control. That Cloudera and Hortonworks acquired dedicated Hadoop security companies in the space of a month illustrates the intensity of the need in the Big Data space to package proven Hadoop security technologies in conjunction with Hadoop deployments and third-party tools for optimizing Hadoop analytics and data management. Cloudera, for example, actively contributes to the open source initiative Project Rhino, which seeks to augment the data protection functionality of Hadoop and contribute the resulting code back to the Apache Software Foundation. The bottom line is that Hadoop security has suddenly emerged as an urgent vertical within the Big Data space, one that testifies to the increasing prevalence and scale of the deployment of Hadoop distributions in the enterprise.
DataTorrent Announces General Availability Of Platform For Real-Time Analytics On Streaming Hadoop Data
DataTorrent recently announced the general availability of DataTorrent Real-Time Streaming, a platform that delivers real-time analytic capabilities on Apache Hadoop that allow users to obtain actionable business intelligence from streams of Hadoop data. DataTorrent Real-Time Streaming boasts the ability to run analytics on streams of Hadoop data at volumes of over 1 billion events per second by using in-memory processing with low to zero latency. Whereas comparable technologies such as Spark Streaming from Apache Spark split a stream of Hadoop data into segments and perform in-memory processing on each segment, DataTorrent Real-Time Streaming operates directly on Hadoop containers without scheduling batches of Hadoop streams for processing. By avoiding the scheduling overhead associated with processing “mini-batches” of Hadoop data, DataTorrent claims operational efficiencies that allow it to process more Hadoop events with sub-second latency than competing products.
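The difference between the mini-batch model and event-at-a-time processing described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the implementation of Spark Streaming or DataTorrent: batches are cut by event count rather than by time interval, the stream is a hypothetical list of integers, and both function names are invented for the example. Each function maintains a running total over the stream.

```python
from typing import Iterable, Iterator, List


def micro_batch_sums(events: Iterable[int], batch_size: int) -> Iterator[int]:
    """Mini-batch model: buffer events, then emit one running total per batch.

    Assumption: batches are cut by event count to keep the sketch simple;
    a real mini-batch engine would typically cut them by time interval.
    """
    total = 0
    batch: List[int] = []
    for value in events:
        batch.append(value)
        if len(batch) == batch_size:
            total += sum(batch)
            yield total
            batch = []
    if batch:  # flush the final, possibly partial batch
        total += sum(batch)
        yield total


def per_event_sums(events: Iterable[int]) -> Iterator[int]:
    """Event-at-a-time model: update and emit the running total on every event."""
    total = 0
    for value in events:
        total += value
        yield total


# Hypothetical stream of event values.
stream = [3, 1, 4, 1, 5, 9, 2]
batched = list(micro_batch_sums(stream, batch_size=3))
streamed = list(per_event_sums(stream))
```

Both approaches arrive at the same final total, but the mini-batch version produces a result only when a batch closes, so an event that arrives early in a batch waits for the rest of the batch before it is reflected in the output. That wait, plus the cost of scheduling each batch, is the overhead the paragraph above refers to.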
Phu Hoang, co-founder and CEO, DataTorrent, remarked on the innovation enabled by DataTorrent Real-Time Streaming as follows:
Hadoop has made big data analytics a reality; however, the true value of big data is unlocked when it can be acted upon in real-time. DataTorrent Real-Time Streaming is designed specifically to address this need for the enterprise. Through the advances provided by Hadoop 2.0, we are proud to raise the bar on real-time analytics to offer the industry’s first true real-time data ingestion and analysis platform at scale.
Designed specifically for Hadoop 2.0 and the enhancements enabled by YARN, DataTorrent RTS also boasts the ability to perform complex, high-performance computation on streaming Hadoop data with high availability. Certified to work with Hadoop distributions from Cloudera, Hortonworks and MapR, DataTorrent RTS is a commercial product that competes in the increasingly hot space of real-time analytics on streaming Big Data alongside the likes of Apache Storm, Apache Spark and Amazon Kinesis. Questions of performance aside, one of the keys to DataTorrent’s success will be its ease of implementation and ability to simplify and streamline the derivation of meaningful analytics from streaming Hadoop data. To date, the Santa Clara-based company has raised $8M in Series A funding in a round led by August Capital.