MarkLogic 5 Features Hadoop Connector For Enhanced Big Data Analytics

With the November 1 release of MarkLogic 5, MarkLogic consolidated its position in the Big Data space by announcing support for Hadoop, the Apache open source software framework for analyzing massive amounts of structured and unstructured data. For over a decade, MarkLogic has delivered analytics that enable actionable intelligence on data for organizations such as JPMorgan Chase, LexisNexis and the U.S. Army. MarkLogic 5 features a connector for Hadoop that integrates Hadoop’s capabilities for processing petabytes of data with MarkLogic’s proprietary applications for analyzing Big Data. In addition to the Hadoop connector, MarkLogic 5 includes enhanced capabilities to store, tag and analyze textual data and digital interactive media. The latest release also features improved database replication and functionality for monitoring the performance of enterprise-level Big Data installations.

The release of MarkLogic 5 testifies to the explosion of commercial interest in non-relational databases for storing and mining unstructured data. Microsoft plans to integrate Hadoop with Windows Server and Windows Azure in its Big Data platform, with connectors to SQL Server 2012. Oracle, meanwhile, recently revealed the basic components of its Big Data appliance, which features Hadoop in addition to its Oracle NoSQL database.

IBM Releases Big Data Software On SmartCloud; Cognos for iPad

On Monday, IBM announced the release of the InfoSphere BigInsights application for analyzing massive volumes of structured and unstructured data on its SmartCloud environment. The SmartCloud release means that IBM beat competitors Oracle and Microsoft in the race to deploy an enterprise-grade, cloud-based Big Data analytics platform. Over the past month, Oracle and Microsoft have revealed plans to release cloud-based Big Data applications that leverage Apache Hadoop, although both companies have scheduled their live releases for 2012. BigInsights was previously accessed via the IBM Smart Business Development and Test Cloud environment, which served as the testing ground for IBM’s SmartCloud, deployed in April 2011.

IBM developed its Big Data analytics platform because organizations across a number of verticals are drowning in the sea of unstructured data such as Facebook and Twitter feeds, internet searches, log files and emails. IBM’s press release quantified the size of the emerging big data space as follows:

Organizations of all sizes are struggling to keep up with the rate and pace of big data and use it in a meaningful way to improve products, services, or the customer experience. Every day, people create the equivalent of 2.5 quintillion bytes of data from sensors, mobile devices, online transactions, and social networks; so much that 90 percent of the world’s data has been generated in the past two years. Every month people send one billion Tweets and post 30 billion messages on Facebook. Meanwhile, more than 1 trillion mobile devices are in use today and mobile commerce is expected to reach $31 billion by 2016.

IBM customers in the banking, insurance and communications verticals are currently using BigInsights to better understand trends from web analytics, social media feeds, text messages and other forms of unstructured data. The availability of BigInsights via IBM’s SmartCloud is likely to accelerate enterprise adoption of the product, given enterprise familiarity with the SmartCloud offering and recent publicity about its October 12 upgrade. The deployment of BigInsights on SmartCloud also gives IBM early traction in the Big Data space, where it faces competition from Amazon Web Services’ Elastic MapReduce, EMC, Teradata and HP. Granted, Oracle and Microsoft are set to join the Big Data party soon, but IBM should have at least six months to consolidate its market positioning ahead of its West Coast-based competitors. The enterprise version of BigInsights is priced at 60 cents per cluster per hour, whereas the basic version is free.

Key features of the enterprise-level IBM InfoSphere BigInsights include the following:

• Advanced text analytics to mine massive amounts of textual data
• A spreadsheet-like interface called BigSheets that allows users to create and deploy analytics without writing code
• Web-based management console
• Jaql, a query language for querying structured and unstructured data through an interface that resembles SQL

In tandem with the release of BigInsights on the SmartCloud, IBM announced the availability of IBM Cognos Mobile on the iPad and iPhone. iPad users can now use Cognos to run analytics on data and access a suite of visually rich dashboards. The combination of Cognos on the iPad and BigInsights indicates that portable access to data analytics is a key component of IBM’s big data strategy. The big question now is how Oracle and Microsoft will differentiate their forthcoming Big Data offerings from BigInsights.

Battle for Big Data Heats Up As Microsoft and Oracle Announce Hadoop-based Products

The battle for market share in the big data space is officially underway. At last week’s Professional Association for SQL Server (PASS) Summit, Microsoft announced plans to develop a platform for big data processing and analytics based on Hadoop, the open source software framework that operates under an Apache license. Microsoft’s announcement comes roughly ten days after Oracle unveiled its Big Data Appliance, which provides enterprise-level capabilities to process structured and unstructured data.

Key features of Oracle’s Big Data Appliance include the following:

• Software
– Apache Hadoop
– Oracle NoSQL Database Enterprise Edition
– Oracle Data Integrator Application Adapter for Hadoop
– Oracle Loader for Hadoop
– Open source distribution of R

• Hardware
– Oracle’s Exadata x86 clusters (Oracle Exadata Database Machine, Oracle Exalytics Business Intelligence Machine)

Oracle’s hardware supports the Oracle 11g R2 database alongside Oracle’s version of Red Hat Enterprise Linux and virtualization based on the Xen hypervisor. The company’s announcement of its plans to leverage a NoSQL database represented an abrupt about-face from an earlier Oracle position that dismissed the significance of NoSQL. In May, Oracle published a whitepaper, “Debunking the NoSQL Hype,” that downplayed the enterprise-level capability of NoSQL deployments.

Microsoft’s forthcoming Big Data platform features the following:

• Hadoop for Windows Server and Azure
• Hadoop connectors for SQL Server and SQL Parallel Data Warehouse
• Hive ODBC drivers for users of Microsoft Business Intelligence applications

Microsoft revealed a strategic partnership with Yahoo spinoff Hortonworks to integrate Hadoop with Windows Server and Windows Azure. Microsoft’s decision not to leverage NoSQL and to use instead a Windows-based version of Hadoop for SQL Server 2012 constitutes the key difference between Microsoft’s and Oracle’s Big Data platforms. The entry of Microsoft and Oracle into the Big Data space suggests that the market is ready to explode as government agencies and private sector organizations increasingly find business value in unstructured data such as emails, log files, Twitter feeds and text-centered data. IBM and EMC hold the early market share lead, but competition is set to intensify, particularly given the recent affirmation handed to NoSQL by tech giant Oracle.

Amazon Web Services Follows Microsoft by Eliminating Inbound Data Charges

Amazon Web Services (AWS) promised to eliminate inbound data fees starting July 1 in a move that matched Microsoft’s recent announcement of the same for its Windows Azure platform. Moreover, AWS cut the price of the first 10 terabytes of outbound traffic per month from 15 cents to 12 cents per GB. The next 40 terabytes of outbound data transfer per month (total: 50 terabytes) have been discounted from 11 cents to 9 cents per GB, and the next 100 terabytes per month (total: 150 terabytes) from 9 cents to 7 cents per GB. In a blog post, Amazon Web Services remarked: “There is no charge for inbound data transfer across all services in all regions. That means, you can upload petabytes of data without having to pay for inbound data transfer fees. On outbound transfer, you will save up to 68% depending on volume usage. For example, if you were transferring 10 TB in and 10 TB out a month, you will save 52% with the new pricing. If you were transferring 500 TB in and 500 TB out a month, you will save 68% on transfer with the new pricing.”
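The tiered outbound rates above can be worked through with a short calculation. The following is a minimal sketch rather than an official AWS calculator: the function name `outbound_cost`, the constant names, and the conversion of 1 TB to 1,024 GB are assumptions made for illustration, and only the three tiers quoted in this post are modeled. Rates above 150 terabytes and the previous inbound rate are not given here, so the sketch refuses to guess at them.

```python
# Outbound tiers as described in this post:
# (tier size in TB, old price in $/GB, new price in $/GB)
TIERS = [
    (10, 0.15, 0.12),   # first 10 TB per month
    (40, 0.11, 0.09),   # next 40 TB (up to 50 TB total)
    (100, 0.09, 0.07),  # next 100 TB (up to 150 TB total)
]

GB_PER_TB = 1024  # assumption: binary terabytes


def outbound_cost(tb_out, use_new_prices=True):
    """Monthly outbound transfer cost in dollars for tb_out terabytes."""
    if tb_out < 0:
        raise ValueError("transfer volume cannot be negative")
    remaining = tb_out
    total = 0.0
    for size_tb, old_rate, new_rate in TIERS:
        if remaining <= 0:
            break
        # Bill only the portion of the volume that falls in this tier.
        billed_tb = min(remaining, size_tb)
        rate = new_rate if use_new_prices else old_rate
        total += billed_tb * GB_PER_TB * rate
        remaining -= billed_tb
    if remaining > 0:
        # The post does not list prices beyond the third tier.
        raise ValueError("rates above 150 TB are not given in the post")
    return total


if __name__ == "__main__":
    for tb in (10, 50, 150):
        old = outbound_cost(tb, use_new_prices=False)
        new = outbound_cost(tb, use_new_prices=True)
        saving = 100 * (old - new) / old
        print(f"{tb} TB out: ${old:,.2f} -> ${new:,.2f} ({saving:.0f}% saved)")
```

Note that this covers outbound traffic only. The savings percentages in the AWS quote also count the eliminated inbound charges, so they are larger than the outbound-only figures this sketch produces.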

Microsoft announced its intention last week to eliminate inbound data transfer fees, citing the case of Press Association Sport, a partner of the Press Association, the national news agency of the UK. Press Association Sport planned to upload “large amounts of text, data and multimedia content every month” into Windows Azure, and the CTO of the Press Association remarked on the benefits of free inbound data transfers as follows: “Estimating the amount of data we will upload every month is a challenge for us due to the sheer volume of data we generate, the fluctuations of volume month on month and the fact that it grows over time. Eliminating the cost of inbound data transfer made the project easier to estimate and removes a barrier to uploading as much data as we think we may need.” Amazon followed suit a week after Microsoft’s June 22 announcement. In a June 29 blog post, AWS CTO Werner Vogels indicated that further price decreases from AWS would follow as the company scaled and made its operations more efficient.

Microsoft Releases Office 365 to Stake Claim to Cloud Productivity Software Space

Microsoft Corporation consolidated its position in the productivity software market on Tuesday with the market release of Office 365, the online version of its Microsoft Office suite of software applications such as Microsoft Word, Excel, PowerPoint and OneNote. In releasing the market version of a beta product circulated in the fall of 2010, Microsoft goes head-to-head with Google Apps in the competition for enterprise market share from businesses seeking productivity and collaboration tools. Although Office 365 has the potential to cannibalize sales of the popular desktop Microsoft Office suite, Microsoft predicts net revenue from its productivity software will increase as a result of subscriptions from small and medium-sized businesses. Speaking of the product’s target market, Microsoft CEO Steve Ballmer noted: “Office 365 levels the playing field, giving small and midsize businesses powerful collaboration tools that have given big businesses an edge for years.”

In a manner similar to Google Apps for Business, Office 365 allows multiple users to simultaneously edit the same documents and access them from web-enabled devices including smartphones. Office 365 is priced anywhere from $2/month per user for email services to $24/month per user for the most powerful versions of the Office productivity suite and Exchange, SharePoint and Lync Online. Office 365 for Small Businesses, intended for a maximum of 50 users with “minimal IT resources,” is aggressively priced at $6/month per user, compared with $5/month per user for Google Apps. Microsoft’s entrance into the cloud-based productivity software space was not lost on Google. Shan Sinha, Google Apps Product Manager, noted in a Google blog post that Office 365 was designed for individuals whereas Google Apps was conceived for teams; that Office 365 is optimized for Windows PCs whereas Google Apps works well on virtually any web-enabled platform; and that Google Apps is optimized for cloud-based deployment whereas Office 365 represents “legacy, desktop software” that has been transferred to a data center and labeled “cloud.” “Apps,” Sinha notes, “was born for the web and we’ve been serving hundreds of millions of users for years.”

Analysts are divided as to who holds the advantage between Google and Microsoft in the productivity software space. On the one hand, Google holds a competitive edge both in terms of first mover advantage and the free version of its productivity suite, Google Docs. On the other hand, Microsoft dominates the productivity software space with 90% of the market and a customer base that is familiar with and loyal to its software. That said, questions remain as to whether Microsoft can fix the problems that caused outages of the precursor to Office 365, the Business Productivity Online Suite (BPOS). Google clearly has more experience and skill with large-scale cloud deployments, although it remains to be seen how convincingly its productivity suite can gain traction in the enterprise space.