Teradata continued its spending spree on Wednesday by acquiring Think Big Analytics, a Mountain View, CA-based Hadoop consulting firm. The acquisition will supplement Teradata’s own consulting practice. Think Big Analytics, which has roughly 100 employees, specializes in agile SDLC methodologies for Hadoop consulting engagements that typically run from one to three months. According to Teradata Vice President of Product and Services Marketing Chris Twogood, as reported in PCWorld, Teradata has “now worked on enough projects that it’s been able to build reusable assets.” Think Big Analytics will retain its branding, and its management team will remain at the company’s Mountain View office. The acquisition comes roughly two months after Teradata’s purchase of Revelytix, which provides a metadata management framework for Hadoop, and Hadapt, whose technology empowers SQL developers to manipulate and analyze Hadoop-based data. Teradata’s third Big Data acquisition in roughly two months arrives at a moment when the Big Data space is exploding with vendors that variously tackle the discovery, exploration, analysis and visualization of Hadoop-based data. The question now is whether the industry will see early market consolidation, with startups snapped up by larger vendors, or whether the innovation that startups provide can survive a land grab initiated by larger, well-capitalized companies seeking to complement their Big Data portfolios with newly minted products and technologies. Terms of Teradata’s acquisition of Think Big Analytics were not disclosed.
Trifacta recently announced a deeper integration with Tableau, the leader in data visualization and business intelligence, as a key feature of version 1.5 of its Data Transformation Platform. The release allows customers to export Trifacta data to Tableau Data Extract format or register it with Hadoop’s HCatalog, thereby streamlining the flow of Hadoop-based data from Trifacta into Tableau. Trifacta’s Chief Strategy Officer Joe Hellerstein remarked on the significance of the deeper integration with Tableau as follows:
Tableau creates huge opportunities for effectively analyzing data, but working with big data poses specific challenges. The most significant barriers come from structuring, distilling and automating the transfer of data from Hadoop. Our integration removes these barriers in a way that complements self-service data analysis. Now, Trifacta and Tableau users can move directly from big data in Hadoop to powerful, interactive visualizations.
Trifacta’s ability to output data in Tableau Data Extract format means that its customers can more seamlessly move Trifacta data into Tableau and reap the benefits of Tableau’s renowned data visualization capabilities. The Trifacta Data Transformation Platform specializes in enhancing analyst productivity on Big Data sets by delivering a machine learning-based user interface through which analysts can explore, transform, cleanse, visualize and manipulate massive data sets. Moreover, Trifacta’s predictive interaction technology iteratively learns from analyst behavior and offers users guided suggestions about productive paths for data discovery and exploration. With the deepened Tableau integration, data transformed in Trifacta now has a streamlined segue into the Tableau platform, while the strengthened partnership positions Tableau to consolidate its standing as the de facto business intelligence platform for Hadoop-based data.
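The HCatalog route works because HCatalog exposes the Hive metastore to other Hadoop tools: once a directory of transformed output is registered as a table, any HCatalog-aware consumer can read it by name. As a rough illustration only (the table name, columns and HDFS path below are hypothetical, and the DDL is a generic Hive `CREATE EXTERNAL TABLE` statement rather than anything Trifacta-specific), registering a directory of delimited output might look like:

```python
def hcatalog_register_ddl(table, hdfs_path, columns):
    """Build a Hive DDL statement that registers a directory of
    transformed output as an external table in the Hive metastore,
    which HCatalog then exposes to downstream tools."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_path}';"
    )

# Hypothetical table, columns, and output path -- for illustration only.
ddl = hcatalog_register_ddl(
    "clean_orders",
    "/user/trifacta/output/clean_orders",
    [("order_id", "BIGINT"), ("total", "DOUBLE"), ("region", "STRING")],
)
print(ddl)
```

Because the table is external, dropping it later removes only the metastore entry, not the underlying HDFS data, which suits a pipeline where the transformation tool owns the files.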
Pivotal and Hortonworks will collaborate to accelerate development of Apache Ambari, the open source framework for provisioning, managing and monitoring Hadoop clusters. Pivotal will dedicate engineers to advancing the “installation, configuration and management capabilities” of Apache Ambari as part of its larger effort to contribute to software that promotes adoption of Apache Hadoop. In a blog post, Pivotal’s Jamie Buckley elaborated on the value of Apache Ambari to the Hadoop ecosystem as follows:
Apache Hadoop projects are central to our efforts to drive the most value for the enterprise. An open source, extensible and vendor neutral application to manage services in a standardized way benefits the entire ecosystem. It increases customer agility and reduces operational costs and can ultimately help drive Hadoop adoption.
Here, Buckley remarks on the way Ambari eases the deployment and management of Hadoop, reducing operational costs and giving customers greater flexibility in how they operationalize Hadoop. Meanwhile, Shaun Connolly, VP of Strategy at Hortonworks, commented on the significance of Pivotal’s contribution to the Apache Ambari project as follows:
Pivotal has a strong record of contribution to open source and has proven their commitment with projects such as Cloud Foundry, Spring, Redis and more. Collaborating with Hortonworks and others in the Apache Hadoop ecosystem to further invest in Apache Ambari as the standard management tool for Hadoop will be quite powerful. Pivotal’s track record in open source overall and the breadth of skills they bring will go a long way towards helping enterprises be successful, faster, with Hadoop.
Connolly highlights Pivotal’s historical commitment to open source projects such as Cloud Foundry and its track record of helping enterprises effectively utilize Apache Hadoop. Hortonworks stands to gain from Pivotal’s engineering talent and reputation for swiftly releasing production-grade code for Big Data management and analytics applications, while Pivotal benefits from enriching an open source project that both vendors describe as a “standard” management tool for the Apache Hadoop ecosystem. The real winner, however, is Hortonworks, which can now claim Pivotal’s backing for a project incubated in part by its own engineers, and which also gains dedicated engineering staff from Pivotal that will almost certainly accelerate Ambari’s rate of development. The one qualification is that the collaboration is likely to optimize Ambari for the Pivotal HD and Hortonworks distributions, with the ancillary consequence that Ambari may become less suited to other Hadoop distributions such as those from Cloudera and MapR. Regardless, the collaboration between Hortonworks and Pivotal promises to be a boon for the Big Data industry at large, both by expediting development of Apache Ambari and by constituting a model for collaboration between competitors that ultimately enhances Hadoop adoption and effective utilization.
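Part of what makes Ambari plausible as a vendor-neutral “standard” management layer is its REST API, which lets any vendor’s tooling provision and monitor a cluster over HTTP. As a hedged sketch (the host and cluster names below are hypothetical; the `/api/v1/clusters/.../services/...` endpoint and default port 8080 reflect Ambari’s documented REST interface), checking a service’s state might look like:

```python
import base64
import json
import urllib.request

# Hypothetical Ambari server; 8080 is Ambari's default HTTP port.
AMBARI_BASE = "http://ambari.example.com:8080/api/v1"

def service_state_url(cluster, service):
    """URL for querying a single service's state via Ambari's REST API."""
    return f"{AMBARI_BASE}/clusters/{cluster}/services/{service}?fields=ServiceInfo/state"

def fetch_service_state(cluster, service, user="admin", password="admin"):
    """GET the service state (e.g. STARTED, INSTALLED) with HTTP basic auth.
    Defined but not invoked here, since it requires a live Ambari server."""
    req = urllib.request.Request(service_state_url(cluster, service))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["ServiceInfo"]["state"]

print(service_state_url("mycluster", "HDFS"))
# → http://ambari.example.com:8080/api/v1/clusters/mycluster/services/HDFS?fields=ServiceInfo/state
```

The same API family covers installing components and starting or stopping services, which is precisely the “installation, configuration and management” surface Pivotal’s engineers are contributing to.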
On Thursday, HP announced an agreement to invest $50M in Hortonworks. HP’s investment builds on the $100M Hortonworks raised in March in a financing round led by funds managed by BlackRock and Passport Capital alongside existing investors. The investment underscores HP’s commitment to its reseller relationship with Hortonworks, under which it resells the Hortonworks Data Platform. Moreover, HP plans to continue engineering its products to integrate with YARN, the resource management component of Hadoop 2.x, and, beyond YARN, to optimize its product architecture for the Hortonworks Data Platform more generally. Key HP products targeted for integration with the Hortonworks Data Platform include the HP HAVEn platform, one component of which is HP Vertica. As a result of the $50M equity investment, HP Executive Vice President and Chief Technology Officer Martin Fink will join the Hortonworks board of directors. HP’s investment in Hortonworks underscores how the Big Data revolution stands poised to accelerate as technology companies deepen their relationships with Hadoop vendors in anticipation of delivering turnkey Big Data analytics solutions that simplify and streamline the operationalization of Big Data.
Databricks, the company founded by the team that developed Apache Spark, recently announced the finalization of $33M in Series B funding in a round led by New Enterprise Associates with participation from existing investor Andreessen Horowitz. The company also revealed plans to commercialize Apache Spark by means of the newly launched Databricks Cloud, which simplifies the pipeline for storing data, performing ETL and then running analytics and visualizations on cloud-based Big Data. Powered by Apache Spark, the Databricks Cloud leverages Spark’s array of capabilities for operating on Big Data, including streaming data processing, graph processing, SQL on Hadoop and machine learning. The platform aims to deliver a streamlined pipeline for ingesting, analyzing and visualizing Hadoop-based data in a way that obviates the need for a combination of heterogeneous technologies. Databricks will initially offer the Databricks Cloud on Amazon Web Services but plans to expand its availability to other clouds in subsequent months.
On Monday, MapR Technologies announced the finalization of $110M in funding, comprising $80M in equity financing and $30M in debt financing. Google Capital led the equity round alongside Qualcomm Incorporated, Lightspeed Venture Partners, Mayfield Fund, NEA and Redpoint Ventures, while the debt funding was provided by Silicon Valley Bank. The funding will be used to fuel MapR’s explosive growth in the Hadoop distribution and analytics space, as illustrated by a threefold increase in bookings in Q1 2014 compared to 2013. Gene Frantz, General Partner at Google Capital, commented on Google Capital’s participation in the June 30 funding raise as follows:
MapR helps companies around the world deploy Hadoop rapidly and reliably, generating significant business results. We led this round of funding because we believe MapR has a great solution for enterprise customers, and they’ve built a strong and growing business.
Monday’s announcement comes soon after MapR’s news of its support for Apache Hadoop 2.x and YARN, in addition to all five components of Apache Spark, the open source technology used for Big Data applications specializing in interactive analytics, real-time analytics, machine learning and stream processing. The additional $110M in funding strongly positions MapR against competitors Cloudera and Hortonworks, given that Cloudera recently raised $900M and Hortonworks finalized $100M in funding. The news of MapR’s $110M funding also coincides with a recent statement from Hortonworks certifying the compatibility of Apache Spark with YARN as part of a larger announcement about integrating Spark into the Hortonworks Data Platform (HDP) alongside XA Secure, its Hadoop security acquisition, and Apache Ambari for the provisioning and management of Hadoop clusters. With a fresh round of capital in the bank and backing from Google, the creator of MapReduce, MapR signals that the battle for Hadoop market share is a three-horse race that is almost certain to intensify as vendors compete to streamline and simplify the operationalization of Big Data. In the meantime, Big Data-related venture capital continues to flow like water from an open fire hydrant as the space tackles problems related to Big Data analytics, streaming data and Hadoop security.