Cloudera And Cask Partner To Align Cask’s Application Development Platform With Cloudera’s Hadoop Product Portfolio

Cloudera and Cask recently announced a strategic collaboration marked by a commitment to integrate the product roadmaps of both companies into a unified vision aimed at empowering developers to more easily build and deploy applications on Hadoop. As part of the collaboration, Cloudera made an equity investment in Cask, the company formerly known as Continuuity. Cask’s flagship product is the Cask Data Application Platform (CDAP), an application platform that streamlines and simplifies Hadoop-based application development while also delivering operational tools for integrating application components and performing runtime services. The integration of the open source CDAP with Cloudera’s open source Hadoop distribution represents a major coup for Cask insofar as its technology stands to become tightly integrated with one of the most popular Hadoop distributions in the industry, positioning Cask as a potential acquisition target for Cloudera as its product matures. Cloudera, on the other hand, stands to gain from Cask’s progress in building a platform for Big Data application development that runs natively within a Hadoop infrastructure. By aligning its product roadmap with Cask’s, Cloudera adds yet another feather to its cap with respect to tools and platforms within its ecosystem that enhance and accelerate the experience of Hadoop adoption. Overall, the partnership strengthens Cloudera’s case for going public by illustrating the astuteness and breadth of its vision when it comes to strategic partners and collaborators such as Cask, not to mention the business and technological benefits of the partnership itself. Expect Cloudera to continue aggressively building out its partner ecosystem as it hurtles toward an IPO that, as reported in VentureBeat, it may already be preparing.
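For context on the kind of developer experience CDAP aims to streamline, here is a minimal sketch of a CDAP application, assuming the Java API Cask documented around this period (package and class names such as AbstractApplication, Stream and KeyValueTable may vary across CDAP versions; the application and dataset names below are hypothetical):

```java
import co.cask.cdap.api.app.AbstractApplication;
import co.cask.cdap.api.data.stream.Stream;
import co.cask.cdap.api.dataset.lib.KeyValueTable;

// Minimal CDAP application sketch: the platform generates the underlying
// Hadoop plumbing (ingestion streams, datasets, program lifecycle) from
// this declarative configure() method.
public class EventAnalyticsApp extends AbstractApplication {
  @Override
  public void configure() {
    setName("EventAnalyticsApp");
    setDescription("Ingests raw events and persists them for analysis");
    // A stream is CDAP's managed ingestion endpoint for raw events.
    addStream(new Stream("events"));
    // A dataset abstracts the underlying HBase/HDFS storage.
    createDataset("eventStore", KeyValueTable.class);
    // Flows, MapReduce programs and services that process the stream
    // would be registered here as well (omitted in this sketch).
  }
}
```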

Qumulo Announces $40M In Series B Funding For Enterprise Data Storage Solution In Stealth

Seattle-based enterprise data storage startup Qumulo today announced the finalization of $40M in Series B funding in a round led by Kleiner Perkins Caufield & Byers with additional participation from existing investors Highland Capital, Madrona Venture Group and Valhalla Partners. The funding will be used to accelerate product development and expand Qumulo’s sales and marketing efforts. Still in stealth, Qumulo tackles data management problems specific to storage infrastructures for massive amounts of data. The company aspires to become the “company the world trusts to store, manage, and curate its data forever,” as noted in its mission statement. In a phone interview with Cloud Computing Today, Qumulo’s CEO Peter Godman remarked that retaining “data forever” becomes feasible only because the cost of storage hardware continues to fall. That said, industry-wide and global cultural expectations to retain data forever create a veritable constellation of problems related to retrieving and archiving billions or trillions of files within cost-effective, scalable storage platforms. Today’s raise brings the total capital raised by Qumulo to $67M. Sujal Patel, founder of Isilon, the storage vendor acquired by EMC for $2.5 billion, will join Qumulo’s board of directors and complement a leadership team that includes storage experts from Amazon Web Services, Microsoft and Google. As Qumulo emerges from stealth to reveal more details of its product offering, the industry should expect dramatic innovation in enterprise-grade scale-out NAS that forthrightly tackles thorny problems related to the curation and organization of massive data sets.

Datapipe’s Acquisition Of GoGrid Underscores The Industry Convergence Of Cloud And Big Data

Managed hybrid cloud IT solution provider Datapipe recently announced the acquisition of GoGrid, a leader in facilitating the effective operationalization of Big Data for cloud deployments. While GoGrid boasts over a decade of experience in managed and dedicated cloud hosting, the company recently added a slew of Big Data offerings to its product line, including NoSQL database offerings, a 1-button deployment process and a partnership with Cloudera to accelerate Hadoop deployments for the enterprise. Robb Allen, CEO of Datapipe, commented on the significance of the acquisition as follows:

GoGrid has made it easy for companies to stand up Big Data solutions quickly. Datapipe customers will achieve significant value from the speed at which we can now create new Big Data projects in the cloud. This acquisition advances Datapipe’s strategy to help our enterprise clients architect, deploy and manage multi-cloud hybrid IT solutions.

Here, Allen remarks on the way in which GoGrid’s success in streamlining the implementation of Big Data solutions enhances Datapipe’s ability to offer enterprise customers Big Data solutions alongside managed cloud hosting. Given that cloud adoption has significantly outpaced Big Data adoption in the enterprise to date, Datapipe stands poised to consolidate its leadership among cloud vendors offering Big Data solutions to enterprise customers. By acquiring GoGrid, Datapipe positions itself to offer customers the virtually limitless scalability of the cloud in addition to the infrastructure to store petabytes of data. The adoption of cloud-based Big Data solutions enables customers to run analytics in parallel on transactional and non-transactional datasets alike, deriving insights that draw on the union of financial, operational, marketing, sales and third-party data. As a result, Datapipe’s acquisition of GoGrid cements its already strong market positioning in the nascent but rapidly expanding space at the intersection of cloud computing and Big Data.

Conversation With John Fanelli, DataTorrent’s VP of Marketing, Regarding Analytics On Streaming Big Data

Cloud Computing Today recently spoke to John Fanelli, DataTorrent’s VP of Marketing, about Big Data, real-time analytics on Hadoop, DataTorrent RTS 2.0 and the challenges specific to performing analytics on streaming Big Data sets. Fanelli commented on the market reception of DataTorrent’s flagship product DataTorrent RTS 2.0 and the mainstream adoption of Big Data technologies.

1. Cloud Computing Today: Tell us about the market landscape for real-time analytics on streaming Big Data and describe DataTorrent’s positioning within that landscape. How do you see the market for real-time analytics evolving?

John Fanelli (DataTorrent): Data is being generated today not only in unprecedented volume and variety, but also at unprecedented velocity. Human-created data is being surpassed by automatically generated data (sensor data, mobile devices and transaction data, for example) at a very rapid pace. The term we use for this is fast big data. Fast big data can provide companies with valuable business insights, but only if they act on those insights immediately. If they don’t, the business value declines as the data ages.

As a result of this business opportunity, streaming analytics is rapidly becoming the norm as enterprises rush to deliver differentiated offerings that generate revenue or create automated operational efficiencies that save cost. But it’s not just fast big data alone; it’s big data in general. Organizations already have plenty of big data in their Enterprise Data Warehouse (EDW) that can be used to enrich and provide greater context to fast big data. Examples of data that drives business decisions include customer information, location and purchase history.
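To make the enrichment pattern Fanelli describes concrete, the sketch below (plain Java, illustrative only, and not DataTorrent’s API; all type and field names are hypothetical) joins a streaming purchase event against customer dimension data bulk-loaded from the EDW:

```java
import java.util.Map;

// Illustrative enrichment step: a purchase event arriving on the stream is
// joined against customer dimension data pre-loaded from the EDW, so that
// downstream analytics see full business context.
public class EventEnricher {
  // Customer profiles keyed by customer ID, bulk-loaded from the EDW.
  private final Map<String, CustomerProfile> edwLookup;

  public EventEnricher(Map<String, CustomerProfile> edwLookup) {
    this.edwLookup = edwLookup;
  }

  public EnrichedEvent enrich(PurchaseEvent event) {
    // Attach location and purchase history; the lookup may return null for
    // an unknown customer, so the fast path never blocks on the warehouse.
    CustomerProfile profile = edwLookup.get(event.customerId());
    return new EnrichedEvent(event, profile);
  }

  // Hypothetical record types standing in for real schemas.
  public record PurchaseEvent(String customerId, String sku, double amount) {}
  public record CustomerProfile(String location, int lifetimePurchases) {}
  public record EnrichedEvent(PurchaseEvent event, CustomerProfile profile) {}
}
```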

DataTorrent is leading the way in meeting customer requirements in this market by providing extremely scalable ingestion of data from many sources at different rates (“data in motion” and “data at rest”), combined with fault-tolerant, high-performance analytics and flexible Java-based action and alerting, all delivered in an easy-to-use and easy-to-operate product offering, DataTorrent RTS.
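As a rough illustration of the DAG-based programming model behind DataTorrent RTS, the sketch below wires a toy input operator to a console sink using the API of the Apache Apex lineage that underpins RTS; package names and ConsoleOutputOperator are drawn from the Malhar operator library and may differ across versions, and RandomSensor is a hypothetical stand-in for a real ingestion source such as Kafka or file input:

```java
import org.apache.hadoop.conf.Configuration;

import com.datatorrent.api.DAG;
import com.datatorrent.api.DefaultOutputPort;
import com.datatorrent.api.InputOperator;
import com.datatorrent.api.StreamingApplication;
import com.datatorrent.common.util.BaseOperator;
import com.datatorrent.lib.io.ConsoleOutputOperator;

// Sketch of a streaming application: operators are vertices in a DAG,
// and streams are the edges connecting their ports.
public class SensorPipeline implements StreamingApplication {

  // Toy input operator standing in for a real ingestion source from the
  // pre-built operator library (Kafka, files, JMS, etc.).
  public static class RandomSensor extends BaseOperator implements InputOperator {
    public final transient DefaultOutputPort<Double> out = new DefaultOutputPort<>();

    @Override
    public void emitTuples() {
      out.emit(Math.random()); // emit one synthetic reading per call
    }
  }

  @Override
  public void populateDAG(DAG dag, Configuration conf) {
    RandomSensor sensor = dag.addOperator("sensor", new RandomSensor());
    ConsoleOutputOperator console = dag.addOperator("console", new ConsoleOutputOperator());
    // Connect the sensor's output port to the console's input port.
    dag.addStream("readings", sensor.out, console.input);
  }
}
```

The platform handles partitioning, checkpointing and failure recovery for operators declared this way, which is what the fault-tolerance claims above refer to.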

The market will continue to evolve toward analytics that are easier to use across the enterprise (think non-IT users), toward cloud-based deployments, and even toward pre-built blueprints for “enterprise configurable” applications.

2. Cloud Computing Today: How would you describe the reception of DataTorrent RTS 2.0? What do customers like most about the product?

John Fanelli (DataTorrent): Customer feedback on DataTorrent RTS 2.0 has been phenomenal. There are many aspects of the product that are getting rave reviews. I have to call out that developers have reacted very positively to the Hadoop Distributed Hash Table (HDHT) feature, as it provides them with a distributed, fault-tolerant “application scratchpad” that doesn’t require any external technology or databases. Of course, the marquee features that have the data scientist community abuzz are Project DaVinci (visual streaming application builder) and Project Michelangelo (visual data dashboard). Both enable quick experimentation over real-time data and will emerge from private beta over the coming months.

3. Cloud Computing Today: How would you describe the differentiation of DataTorrent RTS from Apache Spark and Apache Storm?

John Fanelli (DataTorrent): DataTorrent provides a complete enterprise-grade solution, not just an event-streaming platform. DataTorrent RTS includes an enterprise-grade platform, a broad set of pre-built operators, and visual development and visualization tools. Enterprises are looking for what DataTorrent calls a SHARPS platform, an acronym covering Scalability, High Availability, Performance and Security. In each of the SHARPS categories, DataTorrent RTS is superior.

4. Cloud Computing Today: What challenges do you foresee for Big Data achieving mainstream adoption in 2015?

John Fanelli (DataTorrent): Fast big data is gaining momentum! Every day I speak with customers and prospects about their fast big data, their use-case requirements and the projected business impact. The biggest challenge they share with me is that they want to move faster than they are able to, given existing projects and the technical skills on their teams. DataTorrent RTS addresses those challenges with its ease of use and operator libraries, which support almost any input/output source or sink and provide pre-built analytics modules.

Treasure Data Closes $15M In Series B Funding For Fully Managed, Cloud-Based Big Data Platform

This week, Treasure Data announced the finalization of $15M in Series B funding led by Scale Venture Partners. The funding will be used to accelerate the expansion of Treasure Data’s proprietary, cloud-based platform for acquiring, storing and analyzing massive amounts of data for use cases that span industries such as gaming, the internet of things and digital media. Treasure Data’s Big Data platform specializes in acquiring and processing streaming big data sets that are subsequently stored in its cloud-based infrastructure. Notably, the platform offers customers a fully managed solution for storing streaming big data that can ingest billions of records per day without relying on the Hadoop Distributed File System (HDFS). Current customers include Equifax, Pebble, GREE, Wish.com and Pioneer, the last of which leverages the Treasure Data platform for automobile-related telematics use cases. In addition to Scale Venture Partners, all existing board members and their associated funds participated in the Series B capital raise, including Jerry Yang’s AME Venture Fund.

MapR Announces Selection By MediaHub Australia For Digital Archiving And Analytics

MapR recently announced that MediaHub Australia has deployed MapR to support its digital archive serving 170+ broadcasters in Australia. MediaHub delivers digital content for broadcasters throughout Australia in conjunction with its strategic partner Contexti. Broadcasters provide MediaHub with segments of programs, live feeds and a schedule that outlines when the program in question should be delivered to its audiences. In addition to scheduled broadcasts, MediaHub offers streaming and video on demand services for a variety of devices. MediaHub’s digital archive automates the delivery of playout services for broadcasters, thereby minimizing the need for manual intervention from archival specialists. MapR currently manages over 1 petabyte of content for the 170+ channels that MediaHub serves, and the digital archive is expected to grow dramatically within the next two years. MapR’s Hadoop-based storage platform also provides an infrastructure for analytics on content consumption that helps broadcasters make data-driven decisions about what content to air in the future and how to most effectively complement existing content. MediaHub’s use of MapR illustrates a prominent use case for the platform, namely, the use of Hadoop for storing, delivering and running analytics on digital media. According to Simon Scott, Head of Technology at MediaHub, one of the key reasons MediaHub selected MapR as the big data platform for its digital archive was its ability to run on commodity hardware.