The battle for market share in the big data space is now underway in earnest. At last week’s Professional Association for SQL Server (PASS) Summit, Microsoft announced plans to develop a platform for big data processing and analytics based on Hadoop, the open source software framework released under an Apache license. Microsoft’s announcement comes roughly ten days after Oracle unveiled its Big Data Appliance, which provides enterprise-level capabilities for processing structured and unstructured data.
Key features of Oracle’s Big Data Appliance include the following:
–Oracle NoSQL Database Enterprise Edition
–Oracle Data Integrator Application Adapter for Hadoop
–Oracle Loader for Hadoop
–Open source distribution of R
–Oracle’s Exadata x86 clusters (Oracle Exadata Database Machine, Oracle Exalytics Business Intelligence Machine)
Oracle’s hardware supports the Oracle 11g R2 database alongside Oracle’s version of Red Hat Enterprise Linux and virtualization based on the Xen hypervisor. The company’s announcement of its plans to leverage a NoSQL database represented an abrupt about-face from an earlier Oracle position that dismissed the significance of NoSQL. In May, Oracle published a whitepaper, “Debunking the NoSQL Hype,” that downplayed the enterprise-level capability of NoSQL deployments.
Microsoft’s forthcoming Big Data platform features the following:
–Hadoop for Windows Server and Azure
–Hadoop connectors for SQL Server and SQL Server Parallel Data Warehouse
–Hive ODBC drivers for users of Microsoft Business Intelligence applications
Microsoft revealed a strategic partnership with Yahoo spinoff Hortonworks to integrate Hadoop with Windows Server and Windows Azure. Microsoft’s decision to forgo NoSQL and instead use a Windows-based version of Hadoop for SQL Server 2012 constitutes the key difference between Microsoft’s and Oracle’s Big Data platforms. The entry of Microsoft and Oracle into the Big Data space suggests that the market is ready to explode as public and private sector organizations increasingly find business value in unstructured data such as emails, log files, Twitter feeds and other text-centered data. IBM and EMC hold the early market share lead, but competition is set to intensify, particularly given the recent affirmation handed to NoSQL by tech giant Oracle.