Cloudera and Trillium Software recently announced a collaboration whereby the Trillium Big Data solution is certified for Cloudera’s Hadoop distribution. As a result of the partnership, Cloudera customers can take advantage of Trillium’s data quality solutions to profile, cleanse, de-duplicate and enrich Hadoop-based data. Trillium responds to a problem in the Big Data industry wherein the customer focus on deployment and management of Hadoop-based data repositories eclipses concerns about data quality. In the case of Hadoop-based data, data quality solutions predictably face challenges associated with the sheer volume of data that requires cleansing or quality improvements. Trillium’s Big Data solution cleanses data natively within Hadoop because identifying data with quality issues and then transporting it to another infrastructure becomes costly and complex. The collaboration between Trillium Software and Cloudera illustrates the relevance of data quality solutions for Hadoop despite the increased attention currently devoted to Big Data analytics and data visualization solutions. As such, Trillium fills a critical niche within the Big Data processing space, and its alliance with Cloudera positions it strongly to consolidate its early traction among solutions dedicated to Big Data quality.
MapR has declined the invitation to participate in the Open Data Platform (ODP) after careful consideration, as noted in a recent blog post by John Schroeder, the company’s CEO and co-founder. Schroeder claims that the Open Data Platform is redundant with the governance provided by the Apache Software Foundation, that it purports to “solve” Hadoop-related problems that do not require solving and that it fails to accurately define the core of the Open Data Platform as it relates to Hadoop. With respect to software governance, Schroeder notes that the Apache Software Foundation has done well to steward the development of Apache Hadoop as elaborated below:
The Apache Software Foundation has done a wonderful job governing Hadoop, resulting in the Hadoop standard in which applications are interoperable among Hadoop distributions. Apache governance is based on a meritocracy that doesn’t require payment to participate or for voting rights. The Apache community is vibrant and has resulted in Hadoop becoming ubiquitous in the market in only a few short years.
Here, Schroeder credits the Apache Software Foundation with creating a Hadoop ecosystem in which Hadoop-based applications interoperate with one another and wherein the governance structure is based on a meritocracy that does not mandate monetary contributions in order to garner voting rights. In addition, the blog post observes that whereas the Open Data Platform defines the core of Apache Hadoop as MapReduce, YARN, Ambari and HDFS, other frameworks such as “Spark and Mesos, are gaining market share” and stand to complicate ODP’s definition of the core of Hadoop. Meanwhile, Cloudera’s Chief Strategy Officer Mike Olson explained why Cloudera also declined to join the Open Data Platform by noting that Hadoop “won because it’s open source” and that the partnership between Pivotal and Hortonworks was “antithetical to the open source model and the Apache way.” Given that 75% of Hadoop implementations use either MapR or Cloudera, ODP looks set to face some serious challenges despite support from IBM, Pivotal and Hortonworks, although the precise impact of the schism over the Open Data Platform on the Hadoop community remains to be seen.
Cloudera And Cask Partner To Align Cask’s Application Development Platform With Cloudera’s Hadoop Product Portfolio
Cloudera and Cask recently announced a strategic collaboration marked by a commitment to integrate the product roadmaps of both companies into a unified vision based around the goal of empowering developers to more easily build and deploy applications using Hadoop. As part of the collaboration, Cloudera made an equity investment in Cask, the company formerly known as Continuity. Cask’s flagship product is the Cask Data Application Platform (CDAP), an application platform used to streamline and simplify Hadoop-based application development in addition to delivering operational tools for integrating application components and performing runtime services. The integration of Cask’s open source Cask Data Application Platform with Cloudera’s open source Hadoop distribution represents a huge coup for Cask insofar as its technology stands to become tightly integrated with one of the most popular Hadoop distributions in the industry and correspondingly vie for potential acquisition by Cloudera as its product develops further. Cloudera, on the other hand, stands to gain from Cask’s progress in building a platform for facilitating Big Data application development that runs natively within a Hadoop infrastructure. By aligning its product roadmap with Cask, Cloudera adds yet another feather to its cap vis-à-vis tools and platforms within its ecosystem that enhance and accelerate the experience of Hadoop adoption. Overall, the partnership strengthens Cloudera’s case for going public by illustrating the astuteness and breadth of its vision when it comes to strategic partners and collaborators such as Cask, not to mention the business and technological benefits of the partnership. Expect Cloudera to continue aggressively building out its partner ecosystem as it hurtles toward an IPO that it may well be already preparing, at least as reported in VentureBeat.
Teradata and Cloudera Partner To Optimize Integration Of Big Data Platforms In Teradata Unified Architecture
Teradata today announced a partnership with enterprise Hadoop vendor Cloudera marked by the optimization of the integration between Teradata’s integrated data warehouse and Cloudera’s enterprise data hub. The collaboration between Teradata and Cloudera streamlines access to multiple data sources by means of the Teradata Unified Data Architecture (UDA). As a result of the integration, the Teradata Unified Data Architecture can access data from the Cloudera enterprise data hub by way of a unified Big Data infrastructure that has the capacity to perform data operations and analytics on massive, heterogeneous datasets featuring structured and unstructured data. As part of today’s announcement, Teradata also revealed details of Cloudera-certified connectors that can integrate with Apache Hadoop. Other components of the UDA that interface with Cloudera’s enterprise data hub include Teradata QueryGrid, which allows users to pose analytical questions of data in both Teradata’s integrated data warehouse and the Cloudera enterprise data hub, and Teradata Loom, which enables tracking, exploration, cleansing and transformation of Hadoop files. Today’s announcement of the integration between Teradata’s integrated data warehouse and Cloudera’s enterprise data hub signals an important development in the Big Data space because the alignment of the product roadmaps of the two vendors promises to position Teradata strongly vis-à-vis the development of Big Data analytics and processing functionality. On Cloudera’s side, the partnership renders its enterprise data hub even more compatible with one of the industry’s most respected Big Data analytic platforms and prefigures the inking of even more partnerships between Hadoop and Big Data management vendors as a means of continuing to foster deeper hardware and software integration in the Hadoop management space.
Cloudera recently acquired Big Data security vendor Gazzang to strengthen the security of its Hadoop distribution and related offerings. Cloudera’s acquisition of Gazzang will provide “enterprise-grade data encryption and key management.” In addition, the Gazzang team will constitute the foundation of the Cloudera Center for Security Excellence dedicated to the development of comprehensive Hadoop security solutions. Cloudera’s acquisition of Gazzang comes weeks after the announcement of the acquisition of XA Secure by Hortonworks to obtain access to a comprehensive security solution for Hadoop that addresses issues such as user authentication, authorization and audit and control. That Cloudera and Hortonworks acquired dedicated Hadoop security companies in the space of a month illustrates the intensity of the need in the Big Data space to package proven Hadoop security technologies in conjunction with Hadoop deployments and third-party tools for optimizing Hadoop analytics and data management. Cloudera, for example, actively contributes to the open source initiative Project Rhino that seeks to augment the data protection functionality of Hadoop and contribute the resulting code back to the Apache Software Foundation. The bottom line is that Hadoop security has suddenly emerged as an urgent vertical within the Big Data space that testifies to the increasing prevalence and scale of the deployment of Hadoop distributions in the enterprise.
Trifacta Partners With Hortonworks To Certify Trifacta Data Transformation Platform On Hortonworks Data Platform
Trifacta today announced that its Trifacta Data Transformation Platform has been certified for use with Hortonworks Data Platform 2.1 (HDP) by means of the Hortonworks Certified Technology Program. The certification ensures the compatibility of the Trifacta Data Transformation Platform with the latest Hortonworks Data Platform and thereby positions Trifacta’s technology to integrate with enterprise-grade deployments of the Hortonworks Hadoop distribution. Today’s announcement further validates the value of the Trifacta Data Transformation Platform as a technology platform that facilitates the derivation of actionable business intelligence from Hadoop by rendering it easier for analysts to visualize and engage with Hadoop-based data in conjunction with machine learning-based suggestions regarding data transformations and analytics. Trifacta’s partnership with Hortonworks builds upon recent news of its $25M Series C raise and the finalization of an analogous collaboration with Hadoop vendor Cloudera. In March, Trifacta announced a partnership with Cloudera that ensures the compatibility of Trifacta’s Data Transformation Platform with the Cloudera Hadoop ecosystem.
Now that Trifacta has inked deals to certify its Data Transformation Platform with the two Hadoop market share leaders, Cloudera and Hortonworks, the Big Data space should expect enterprise deployments of its platform to accelerate as Trifacta solidifies its branding as the de facto platform for the transformation, cleansing and guided exploration of Hadoop-based data. The platform’s value proposition lies in its reduction of the time to insight for actionable business intelligence derived from Hadoop-based data, its ability to enhance analyst productivity and its ability to iteratively deliver more nuanced guidance regarding data transformations of interest by means of its machine learning-based technology. Expect Trifacta to continue expanding its range of strategic partnerships in the forthcoming months as it leverages its recent funding to position itself at the forefront of enterprise technologies regarding the effective operationalization of Big Data.
Trifacta, the data transformation company, today announced the finalization of $25M in Series C funding. The funding round was led by a new investor, Ignition Partners, with additional participation from existing investors Greylock Partners and Accel Partners. As a result of the investment, Frank Artale, Managing Director of Ignition Partners, will join the Trifacta board of directors. The Trifacta Data Transformation Platform enhances the productivity of data analysts and scientists by transforming Big Data into a structure that renders it easier to analyze, visualize and manipulate. The Trifacta platform’s predictive interaction technology allows users to visualize Big Data, interact with different data visualizations and take advantage of machine learning-based predictions regarding data transformations and analytics of interest. The platform aims to deliver transparency regarding the data in question, agility with respect to the user’s ability to interact with Big Data, predictive intelligence based on machine learning about the efficacy of user interactions with data and scalability marked by the ability to interact with large, heterogeneous datasets. As told to Cloud Computing Today by Trifacta CEO Joe Hellerstein, Trifacta customers can take advantage of the platform’s ability to cleanse and organize Big Data in conjunction with other enterprise software platforms such as SAS. That said, Trifacta itself offers its own universe of tools for facilitating insights with respect to Big Data and, unlike many business intelligence or analytics platforms, is designed specifically for the purpose of transformation, data discovery and visualization of massive datasets. Today’s announcement about Trifacta’s Series C funding comes hot on the heels of a March 2014 partnership with Cloudera to jointly deliver the Trifacta Data Transformation Platform in conjunction with Cloudera’s Hadoop distribution.
To date, Trifacta has raised a total of approximately $45M in funding. With partnerships such as its Cloudera collaboration in hand, and with $25M in Series C funding arriving roughly 6 months after its Series B capital raise of $12M, Trifacta should see its traction amongst Big Data customers skyrocket as news proliferates about its ability to transform Big Data into a usable form that accelerates the development of actionable business intelligence.