In a stunning move that is likely to shape the Big Data space for years, Intel recently decided to partner with Cloudera and support Cloudera's Hadoop distribution rather than continue enhancing its own. Cloudera will optimize its Hadoop distribution (CDH) to work with Intel's hardware technology, and Intel, in turn, will promote CDH as the Hadoop distribution of choice for enterprise Big Data analytics and the internet of things. Meanwhile, Intel will contribute insights from its own Hadoop distribution to CDH, and the resulting integration will be made available as part of Cloudera's open source Hadoop initiatives. The partnership also features an equity investment by Intel of between $740M and $760M, which translates into an 18% ownership stake in Cloudera. Coming after Cloudera's $160M funding raise in mid-March, the $740M invested by Intel brings Cloudera's recent funding to roughly $900M. Intel will join Cloudera's board of directors and become "Cloudera's largest strategic shareholder." According to its press release, Intel's investment in Cloudera represents Intel's "single largest data center technology investment in its history." Intel's strong presence in countries such as India and China, where Cloudera has thus far failed to gain traction, means that the partnership stands to expand Cloudera's global market share dramatically. More importantly, however, Intel's deep integration with the technologies in almost every datacenter worldwide renders it a formidable ally for Cloudera in its aspiration to become the leading Hadoop distribution in the world, in ways that promise to transform computing hardware as well as the Hadoop distributions that integrate with Intel's Xeon technology.
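As a rough back-of-the-envelope check on the figures above, the sketch below derives the valuation implied by the reported investment range and ownership stake. It assumes, purely for illustration, that the full investment corresponds to the 18% stake on a post-money basis; the announcement as described here does not state a valuation, and the function name `implied_valuation` is hypothetical.

```python
def implied_valuation(investment_musd: float, stake: float) -> float:
    """Return the valuation (in $M) implied by an investment and an ownership stake.

    Assumption: the whole investment buys exactly `stake` of the company
    on a post-money basis. This is an illustrative simplification.
    """
    return investment_musd / stake


# Reported range of Intel's investment (in $M) and the reported stake.
low = implied_valuation(740, 0.18)
high = implied_valuation(760, 0.18)

# Recent funding total: the $160M mid-March raise plus Intel's $740M.
recent_funding = 160 + 740

print(f"Implied valuation: ${low:,.0f}M to ${high:,.0f}M")
print(f"Recent funding: ${recent_funding}M")
```

Under these assumptions the implied valuation lands a little above $4 billion, and the two recent raises sum to the roughly $900M cited above.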
Intel recently released details of its Intel Data Platform, a suite of software applications designed to facilitate analytics on big data. The platform will complement the Intel Distribution for Apache Hadoop by providing a wealth of graph analytic and predictive modeling functionality via the Intel Data Platform: Analytics Toolkit, which enables data scientists to derive actionable business intelligence from big data sets. In addition to raw analytic and data visualization capabilities, the Intel Data Platform can process streaming data sets and perform iterative and interactive analytics. The Analytics Toolkit provides users with graph analytics and machine learning algorithms that support enhanced fraud detection, customer profiling, and big data management and processing. For example, China Mobile Guangdong was able to implement online billing to the point where it could add up to 800,000 new records per second, or up to 30 terabytes of data per month. Similarly, the platform has been used to help retailers respond nimbly to spikes in consumer demand by keeping shelves appropriately stocked, whether those spikes result from promotions on platforms such as Twitter and Facebook, media announcements, or seasonal changes, including unanticipated weather. The Intel Data Platform exemplifies the proliferation of Big Data analytics solutions that have emerged as more and more enterprises experiment with Big Data at varying levels of intensity. The platform will be available in Q2 of 2014 in Enterprise and Premium Editions that differ in the degree of customer support available.
On Tuesday, Intel announced the release of a Hadoop distribution optimized for its own Xeon processors, with a value proposition based on performance and security. Built "from the silicon up," the distribution delivers encryption on Intel Xeon processors through support for Intel AES-NI instructions. Native support for encryption means that organizations can safely manage storage and analytics on petabytes of data without relying on third-party encryption software. The distribution has also been optimized for performance by way of enhancements to the networking and I/O components of the Xeon processor platform. As a result of this optimization, an analysis of one terabyte of data that previously took four hours can now be completed in seven minutes on Intel's distribution of Hadoop. The product also features management tools such as the Intel Manager for Apache Hadoop and the Intel Active Tuner for Apache Hadoop. The Intel Manager for Apache Hadoop streamlines deployment and monitoring of Hadoop clusters, whereas the Intel Active Tuner for Apache Hadoop configures the cluster to ensure optimal performance.
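To put the performance claim in perspective, the short sketch below converts the quoted figures (one terabyte analyzed in four hours versus seven minutes) into a speedup factor and an effective throughput. It is simple arithmetic on the numbers quoted above, not a benchmark; the function and variable names are illustrative.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return before_minutes / after_minutes


DATA_TB = 1.0            # size of the analyzed data set, as quoted
before_minutes = 4 * 60  # four hours, expressed in minutes
after_minutes = 7        # seven minutes on Intel's distribution

factor = speedup(before_minutes, after_minutes)

# Effective throughput in terabytes per hour, before and after.
tb_per_hour_before = DATA_TB / (before_minutes / 60)
tb_per_hour_after = DATA_TB / (after_minutes / 60)

print(f"Speedup: about {factor:.1f}x")
print(f"Throughput: {tb_per_hour_before:.2f} TB/h -> {tb_per_hour_after:.2f} TB/h")
```

By this arithmetic, the quoted figures correspond to a speedup of roughly 34x, or a move from a quarter of a terabyte per hour to more than eight.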