Cloudera’s Acquisition Of Gazzang Underscores Urgency Around Hadoop Security Technologies

Cloudera recently acquired big data security vendor Gazzang to strengthen the security of its Hadoop distribution and related products. The acquisition will provide “enterprise-grade data encryption and key management.” In addition, the Gazzang team will form the foundation of the Cloudera Center for Security Excellence, which is dedicated to developing comprehensive Hadoop security solutions. The deal comes weeks after Hortonworks announced its acquisition of XA Secure to obtain a comprehensive security solution for Hadoop that addresses issues such as user authentication, authorization, and audit and control.

That Cloudera and Hortonworks both acquired dedicated Hadoop security companies within the space of a month illustrates how intense the need has become in the Big Data space to package proven security technologies with Hadoop deployments and with third-party tools for optimizing Hadoop analytics and data management. Cloudera, for example, actively contributes to Project Rhino, an open source initiative that seeks to augment Hadoop’s data protection functionality and contribute the resulting code back to the Apache Software Foundation. The bottom line is that Hadoop security has suddenly emerged as an urgent segment of the Big Data space, a sign of the increasing prevalence and scale of Hadoop deployments in the enterprise.

BigCouch Integration With CouchDB Brings Clustering And Improved Database Compaction To CouchDB

On Monday, Database as a Service vendor Cloudant announced plans to integrate its BigCouch database into the Apache CouchDB project. BigCouch is an open source fork of CouchDB designed to support large-scale, distributed applications. The integration will give CouchDB enhanced scalability and performance, a move that is likely to accelerate adoption of the NoSQL CouchDB platform. In conjunction with this decision, Cloudant announced that it will cease development of the BigCouch platform, which was inspired by Amazon’s famous Dynamo research paper.

CouchDB will benefit principally from the clustering functionality that became one of the trademarks of BigCouch. Unlike a standalone CouchDB server, BigCouch nodes reside in elastic clusters characterized by consistent hashing, quorum rules for read/write operations, and parallel indexing on data partitions, as illustrated by the three-node BigCouch development cluster below, in contrast to the unified CouchDB configuration at the top of the picture:

Graphic source: Cloudant’s BigCouch is open-source

Parallel indexing across the nodes of a cluster allows the BigCouch configuration to demonstrate significant improvements in indexing speed compared with serial indexing of a single database. CouchDB will also benefit from BigCouch’s database compaction functionality, replication speed, and high-concurrency access performance.
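For readers unfamiliar with the clustering terms above, the following is a minimal Python sketch of consistent hashing and Dynamo-style quorum rules. It is an illustration under stated assumptions, not BigCouch’s actual implementation: the names `HashRing`, `replicas`, `quorum_met`, and `quorums_overlap` are hypothetical, the use of MD5 for placement is an arbitrary choice, and N, R, and W follow the conventional Dynamo notation for replica count, read quorum, and write quorum.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is used for placement only)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring; not BigCouch's real data structure."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several virtual points so keys spread more evenly.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, key, n=3):
        """Return up to n distinct nodes, walking clockwise from the key's hash."""
        n = min(n, self._node_count)
        start = bisect.bisect(self._points, _hash(key))
        chosen = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == n:
                    break
        return chosen


def quorum_met(acks: int, required: int) -> bool:
    """An operation succeeds once enough replicas have acknowledged it."""
    return acks >= required


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """With R + W > N, every read quorum shares a replica with every write quorum."""
    return r + w > n


ring = HashRing(["node1", "node2", "node3"])
owners = ring.replicas("doc-42", n=2)  # two distinct nodes hold this document
```

Because placement depends only on the hash of the key and of the node names, adding or removing a node moves only the keys adjacent to that node’s points on the ring, which is what lets such a cluster grow or shrink elastically.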

Adam Kocoloski, co-founder and CTO at Cloudant, remarked on the merging of BigCouch with CouchDB as follows:

There are a lot of reasons people love CouchDB, like its elegant programming model, data durability, flexible indexing, and, most of all, its unique way of replicating and synching data across data centers or devices. We’re merging the horizontal scaling and fault-tolerance framework we built for BigCouch into CouchDB so people can more easily scale all that CouchDB goodness across multiple servers and keep it running nonstop. It’s our way of saying thanks and helping to grow the community of CouchDB developers and users.

Interested users can access a preview of the merged CouchDB and BigCouch code now, although the generally available version of the integrated database will follow the release cycle of the Apache Software Foundation’s code release process. The integration of these two open source platforms represents a significant boost to the NoSQL community as options in the NoSQL space continue to proliferate and deepen in functionality, as exemplified by Garantia’s recent acquisition of MyRedis.