MarkLogic 5 Features Hadoop Connector For Enhanced Big Data Analytics

With the November 1 release of MarkLogic 5, MarkLogic consolidated its position in the Big Data space by announcing support for Hadoop, the Apache open source software framework for analyzing massive amounts of structured and unstructured data. For over a decade, MarkLogic has delivered analytics that yield actionable intelligence for organizations such as JPMorgan Chase, LexisNexis and the U.S. Army. MarkLogic 5 features a connector that combines Hadoop’s capabilities for processing petabytes of data with MarkLogic’s proprietary applications for analyzing Big Data. In addition to the Hadoop connector, MarkLogic 5 includes enhanced capabilities to store, tag and analyze textual data and digital interactive media. The release also features improved database replication and functionality for monitoring the performance of enterprise-level Big Data installations.

The release of MarkLogic 5 testifies to the explosion of commercial interest in non-relational databases for storing and mining unstructured data. Microsoft plans to integrate Hadoop with Windows Server and Windows Azure in its Big Data platform, with connectors to SQL Server 2012. Oracle, meanwhile, recently revealed the basic components of its Big Data Appliance, which features Hadoop in addition to the Oracle NoSQL Database.

Battle for Big Data Heats Up As Microsoft and Oracle Announce Hadoop-based Products

The battle for market share in the Big Data space is officially underway. At last week’s Professional Association for SQL Server (PASS) Summit, Microsoft announced plans to develop a platform for Big Data processing and analytics based on Hadoop, the open source software framework that operates under an Apache license. Microsoft’s announcement comes roughly ten days after Oracle unveiled its Big Data Appliance, which provides enterprise-level capabilities for processing structured and unstructured data.

Key features of Oracle’s Big Data Appliance include the following:

–Apache Hadoop
–Oracle NoSQL Database Enterprise Edition
–Oracle Data Integrator Application Adapter for Hadoop
–Oracle Loader for Hadoop
–Open source distribution of R
–Oracle’s Exadata x86 clusters (Oracle Exadata Database Machine, Oracle Exalytics Business Intelligence Machine)

Oracle’s hardware supports the Oracle 11g R2 database alongside Oracle’s version of Red Hat Enterprise Linux and virtualization based on the Xen hypervisor. The company’s plan to leverage a NoSQL database represents an abrupt about-face from its earlier position, which discounted the significance of NoSQL. In May, Oracle published a whitepaper, “Debunking the NoSQL Hype,” that downplayed the enterprise-level capability of NoSQL deployments.

Microsoft’s forthcoming Big Data platform features the following:

–Hadoop for Windows Server and Azure
–Hadoop connectors for SQL Server and SQL Parallel Data Warehouse
–Hive ODBC drivers for users of Microsoft Business Intelligence applications

Microsoft revealed a strategic partnership with Yahoo spinoff Hortonworks to integrate Hadoop with Windows Server and Windows Azure. The key difference between Microsoft’s and Oracle’s Big Data platforms is Microsoft’s decision not to leverage NoSQL and instead to use a Windows-based version of Hadoop for SQL Server 2012. The entry of Microsoft and Oracle into the Big Data space suggests that the market is ready to explode as government agencies and private-sector organizations increasingly find business value in unstructured data such as emails, log files, Twitter feeds and other text-centered data. IBM and EMC hold the early market share lead, but competition is set to intensify, particularly given the recent endorsement of NoSQL by tech giant Oracle.