Big Data

Pivotal Open Sources Its Big Data Suite And Announces Partnership With Hortonworks

Pivotal recently announced the open sourcing of key components of its Pivotal Big Data Suite. The components to be open sourced include the MPP Pivotal Greenplum Database, Pivotal HAWQ and Pivotal GemFire, the NoSQL in-memory database. Pivotal’s decision to open source the core of its Big Data suite builds upon its success monetizing the Cloud Foundry platform and is intended to accelerate the development of analytics applications that leverage big data and real-time streaming data sets. The open sourcing of Greenplum, Pivotal’s SQL on Hadoop platform HAWQ and GemFire renders Pivotal’s principal analytics and database platforms more readily accessible to the developer community and encourages enterprises to experiment with Pivotal’s solutions. Sundeep Madra, VP of the Data Product Group at Pivotal, remarked on the decision as follows:

Pivotal Big Data Suite is a major milestone in the path to making big data truly accessible to the enterprise. By sharing Pivotal HD, HAWQ, Greenplum Database and GemFire capabilities with the open source community, we are contributing to the market as a whole the necessary components to build solutions that make up a next generation data infrastructure. Releasing these technologies as open source projects will only help accelerate adoption and innovation for our customers.

Pivotal’s announcement of the open sourcing of its Big Data suite comes in tandem with a strategic alliance with Hortonworks aimed at combining the strengths of both companies to deliver best-in-class Hadoop capabilities for the enterprise. The partnership includes product roadmap alignment, integration and a unified vision for leveraging Apache Hadoop to derive actionable business intelligence at a scale rarely achieved within the contemporary enterprise. In conjunction with the Hortonworks collaboration, Pivotal revealed its participation in the Open Data Platform, an organization dedicated to promoting Big Data technologies centered around Apache Hadoop, whose Platinum members include GE, Hortonworks, IBM, Infosys, Pivotal and SAS. The Open Data Platform intends to ensure that components of the Hadoop ecosystem such as Apache Storm, Apache Spark and Hadoop-analytics applications integrate with and optimally support one another.

All told, Pivotal’s decision to open source its Big Data suite represents a major win for the Big Data analytics community at large insofar as organizations now have access to some of the most sophisticated Hadoop-analytics tools in the industry at no charge. More striking, however, is the significance of Pivotal’s alignment with Hortonworks, which stands to tilt the struggle for Hadoop market share toward Hortonworks and away from competitors Cloudera and MapR, at least for the time being. Thus far, Cloudera has enjoyed notable traction in the financial services sector and within the enterprise more generally, but the enriched analytics available to the Hortonworks Data Platform by way of the Pivotal partnership promise to render Hortonworks a more attractive solution, particularly for analytics-intensive use cases. Regardless, Pivotal’s strategic evolution, as represented by its open source move, its collaboration with Hortonworks and its leadership position in the Open Data Platform, constitutes a landmark moment in Big Data history: one of the industry’s most sophisticated big data analytics firms has united with Hortonworks, the company behind the first publicly traded Hadoop distribution. The obvious question now is how Cloudera and MapR will respond to the Open Data Platform, and to what extent Pivotal’s partnership with Hadoop distributions remains exclusive to, or focused on, Hortonworks in the near future.

Categories: Big Data, Hadoop, Hortonworks, Pivotal

HP Releases Predictive Big Data Analytics Platform Featuring Distributed R

On Tuesday, HP announced details of HP Haven Predictive Analytics, a platform that delivers machine learning and statistical analysis for large-scale datasets. HP Haven Predictive Analytics features Distributed R, an analytics engine based on the R programming language and designed to tackle the most complex Big Data predictive analytics tasks in the industry. Moreover, the platform boasts support for SQL and HP Vertica, in addition to preconfigured algorithms that give developers out-of-the-box, R-based analytics. The hallmark of the offering, however, is the Distributed R analytical engine, which parallelizes R processing so that the power of R’s predictive analytics can be brought to bear on big data sets. The conjunction of the platform’s data acceleration functionality with its distributed use of the open source R programming language stands to improve analytic performance on large datasets and enable statisticians to derive actionable business intelligence from petabytes of data with speed and analytic sophistication. As such, HP Haven Predictive Analytics augments the analytic power of the SQL on Hadoop Vertica platform by delivering a Big Data predictive analytics platform capable of analyzing structured and unstructured data via the cloud or an on-premise deployment.
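The core idea behind parallelizing R processing can be sketched in a few lines: partition a large dataset, score each partition in a separate worker, and recombine the results. The sketch below is a conceptual illustration in Python, not HP’s Distributed R API; the model and partition counts are placeholders.

```python
# Conceptual sketch of data-parallel scoring in the spirit of Distributed R:
# split the data, apply a model to each partition in parallel worker
# processes, then combine the per-partition results.
from concurrent.futures import ProcessPoolExecutor

def score_partition(rows):
    """Apply a toy linear model (prediction = 2*x + 1) to one partition."""
    return [2 * x + 1 for x in rows]

def parallel_score(rows, n_partitions=4):
    """Partition the input and score partitions in parallel workers."""
    size = (len(rows) + n_partitions - 1) // n_partitions
    parts = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ProcessPoolExecutor(max_workers=n_partitions) as pool:
        scored = pool.map(score_partition, parts)  # order is preserved
    # Flatten per-partition results back into a single list
    return [y for part in scored for y in part]

if __name__ == "__main__":
    print(parallel_score(list(range(10))))
```

Because each partition is scored independently, throughput scales with the number of workers, which is the property that lets an engine like Distributed R tackle datasets too large for a single R session.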

Categories: Big Data, HP

Cloudera And Cask Partner To Align Cask’s Application Development Platform With Cloudera’s Hadoop Product Portfolio

Cloudera and Cask recently announced a strategic collaboration marked by a commitment to integrate the product roadmaps of both companies into a unified vision based around the goal of empowering developers to more easily build and deploy applications using Hadoop. As part of the collaboration, Cloudera made an equity investment in Cask, the company formerly known as Continuuity. Cask’s flagship product is the Cask Data Application Platform (CDAP), an application platform used to streamline and simplify Hadoop-based application development, in addition to delivering operational tools for integrating application components and performing runtime services. The integration of Cask’s open source Cask Data Application Platform with Cloudera’s open source Hadoop distribution represents a huge coup for Cask insofar as its technology stands to become tightly integrated with one of the most popular Hadoop distributions in the industry and to position Cask as a potential acquisition target for Cloudera as its product matures. Cloudera, on the other hand, stands to gain from Cask’s progress in building a platform for Big Data application development that runs natively within a Hadoop infrastructure. By aligning its product roadmap with Cask’s, Cloudera adds yet another feather to its cap with respect to tools and platforms within its ecosystem that enhance and accelerate Hadoop adoption. Overall, the partnership strengthens Cloudera’s case for going public by illustrating the astuteness and breadth of its vision when it comes to strategic partners and collaborators such as Cask, not to mention the business and technological benefits of the partnership itself. Expect Cloudera to continue aggressively building out its partner ecosystem as it hurtles toward an IPO that it may well already be preparing, as reported in VentureBeat.

Categories: Big Data, Cloudera, Hadoop

Qumulo Announces $40M In Series B Funding For Enterprise Data Storage Solution In Stealth

Seattle-based enterprise data storage startup Qumulo today announced the finalization of $40M in Series B funding in a round led by Kleiner Perkins Caufield & Byers with additional participation from existing investors Highland Capital, Madrona Venture Group and Valhalla Partners. The funding will be used to accelerate product development and expand Qumulo’s sales and marketing efforts. Still in stealth, Qumulo tackles data management problems specific to storage infrastructures for massive amounts of data. The company aspires to become the “company the world trusts to store, manage, and curate its data forever” as noted in its mission statement. In a phone interview with Cloud Computing Today, Qumulo’s CEO Peter Godman remarked that retaining “data forever” becomes feasible thanks to the steadily declining cost of storage hardware. That said, industry-wide and global cultural expectations to retain data forever create a constellation of problems related to retrieving and archiving billions or trillions of files within cost-effective, scalable storage platforms. Today’s funding raise brings the total capital raised by Qumulo to $67M. Sujal Patel, founder of Isilon, the storage vendor acquired by EMC for $2.5 billion, will join Qumulo’s board of directors and complement a leadership team that includes storage experts from Amazon Web Services, Microsoft and Google. As Qumulo emerges from stealth to provide more details of its product offering, the industry should expect dramatic innovation in enterprise-grade scale-out NAS that forthrightly tackles thorny problems related to the curation and organization of massive amounts of data on storage infrastructures designed for big data sets.

Categories: Big Data, Qumulo, Venture Capital

Datapipe’s Acquisition Of GoGrid Underscores The Industry Trend Of The Intersection Of Cloud And Big Data

Managed hybrid cloud IT solution provider Datapipe recently announced the acquisition of GoGrid, a leader in facilitating the effective operationalization of Big Data for cloud deployments. While GoGrid boasts over a decade of experience in managed cloud and dedicated cloud hosting, the company recently added a slew of Big Data offerings to its product line including NoSQL database product offerings and a 1-button deployment process, in addition to a partnership with Cloudera to accelerate Hadoop deployments for the enterprise. Robb Allen, CEO of Datapipe, commented on the significance of Datapipe’s acquisition of GoGrid as follows:

GoGrid has made it easy for companies to stand up Big Data solutions quickly. Datapipe customers will achieve significant value from the speed at which we can now create new Big Data projects in the cloud. This acquisition advances Datapipe’s strategy to help our enterprise clients architect, deploy and manage multi-cloud hybrid IT solutions.

Here, Allen remarks on the way in which GoGrid’s success in streamlining the implementation of Big Data solutions enhances Datapipe’s ability to offer enterprise customers Big Data solutions in conjunction with managed cloud hosting. As such, Datapipe stands poised to consolidate its leadership among cloud vendors offering Big Data solutions to enterprise customers, given that cloud adoption has significantly outpaced Big Data adoption in the enterprise to date. By acquiring GoGrid, Datapipe positions itself to offer its customers the near-limitless scalability of the cloud in addition to the infrastructure to store petabytes of data. The adoption of cloud-based big data solutions enables customers to run analytics in parallel on transactional and non-transactional datasets alike, deriving insights that draw upon the union of financial, operational, marketing, sales and third party data. As a result, Datapipe’s acquisition of GoGrid cements its already strong market position in the nascent but rapidly expanding space marked by the intersection of cloud computing and Big Data.
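The pattern of running independent analytics over heterogeneous datasets in parallel and then combining them into one view can be sketched simply. The dataset names and the mean-as-analytic below are illustrative placeholders, not anything from Datapipe’s or GoGrid’s products.

```python
# Toy sketch: run an independent analytic over several business datasets
# concurrently, then merge the per-dataset summaries into a single view.
from concurrent.futures import ThreadPoolExecutor

datasets = {
    "financial": [120.0, 80.0, 200.0],   # transaction amounts
    "marketing": [0.02, 0.05, 0.03],     # campaign click-through rates
    "operational": [250, 310, 275],      # daily support tickets
}

def summarize(item):
    """Compute a stand-in analytic (the mean) for one named dataset."""
    name, values = item
    return name, sum(values) / len(values)

with ThreadPoolExecutor() as pool:
    summary = dict(pool.map(summarize, datasets.items()))
```

In a real deployment the per-dataset work would be a full analytics job running on cloud infrastructure rather than a mean, but the fan-out/fan-in shape is the same.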

Categories: Big Data, Datapipe, GoGrid

Conversation With John Fanelli, DataTorrent’s VP of Marketing, Regarding Analytics On Streaming Big Data

Cloud Computing Today recently spoke to John Fanelli, DataTorrent’s VP of Marketing, about Big Data, real-time analytics on Hadoop, DataTorrent RTS 2.0 and the challenges specific to performing analytics on streaming Big Data sets. Fanelli commented on the market reception of DataTorrent’s flagship product DataTorrent RTS 2.0 and the mainstream adoption of Big Data technologies.

1. Cloud Computing Today: Tell us about the market landscape for real-time analytics on streaming Big Data and describe DataTorrent’s positioning within that landscape. How do you see the market for real-time analytics evolving?

John Fanelli (DataTorrent): Data is being generated today in not only unprecedented volume and variety, but also velocity. Human-created data is being surpassed by automatically generated data (sensor data, mobile devices and transaction data, for example) at a very rapid pace. The term we use for this is fast big data. Fast big data can provide companies with valuable business insights, but only if they act on them immediately. If they don’t, the business value declines as the data ages.

As a result of this business opportunity, streaming analytics is rapidly becoming the norm as enterprises rush to deliver differentiated offerings to generate revenue or create operational automated efficiencies to save cost. But it’s not just fast big data alone; it’s big data in general. Organizations have plenty of big data already in their Enterprise Data Warehouse (EDW) that is used to enrich and provide greater context to fast big data. Some examples of data that drives business decisions include customer information, location and purchase history.
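The enrichment pattern Fanelli describes, joining fast streaming events against slowly changing warehouse data such as customer profiles, can be sketched as follows. All names and fields here are illustrative, not DataTorrent’s API.

```python
# Minimal sketch of stream enrichment: each incoming event is joined
# against a warehouse-derived dimension table before analysis.
customer_dimension = {
    "c-001": {"name": "Acme Corp", "region": "US-West", "lifetime_purchases": 42},
    "c-002": {"name": "Globex", "region": "EU", "lifetime_purchases": 7},
}

def enrich(event, dimension):
    """Attach warehouse context to a raw streaming event."""
    context = dimension.get(event["customer_id"], {})
    return {**event, **context}  # event fields win no conflicts here

stream = [
    {"customer_id": "c-001", "amount": 99.0},
    {"customer_id": "c-002", "amount": 15.5},
]
enriched = [enrich(e, customer_dimension) for e in stream]
```

In production the dimension table would be cached from the EDW and refreshed periodically, while the stream side runs continuously at high volume.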

DataTorrent is leading the way in meeting customer requirements in this market by providing extremely scalable ingestion of data from many sources at different rates (“data in motion” and “data at rest”), combined with fault tolerant, high performing analytics; flexible Java-based action and alerting, delivered in an easy to use and operate product offering, DataTorrent RTS.

The market will continue to evolve toward making analytics easier to use across the enterprise (think non-IT users), cloud-based deployments and even pre-built blueprints for “enterprise configurable” applications.

2. Cloud Computing Today: How would you describe the reception of DataTorrent RTS 2.0? What do customers like most about the product?

John Fanelli (DataTorrent): Customer feedback on DataTorrent RTS 2.0 has been phenomenal. There are many aspects of the product that are getting rave reviews. I have to call out that developers have reacted very positively to the Hadoop Distributed Hash Table (HDHT) feature, as it provides them with a distributed, fault-tolerant “application scratchpad” that doesn’t require any external technology or databases. Of course, the marquee features that have the data scientist community abuzz are Project DaVinci (visual streaming application builder) and Project Michelangelo (visual data dashboard). Both enable quick experimentation over real-time data and will emerge from Private Beta over the coming months.

3. Cloud Computing Today: How would you describe the differentiation of DataTorrent RTS from Apache Spark and Apache Storm?

John Fanelli (DataTorrent): DataTorrent provides a complete enterprise-grade solution, not just an event-streaming platform. DataTorrent RTS includes an enterprise-grade platform, a broad set of pre-built operators and visual development and visualization tools. Enterprises are looking for what DataTorrent calls a SHARPS platform. SHARPS is an acronym for Scalability, High Availability, Performance and Security. In each of the SHARPS categories, DataTorrent RTS is superior.

4. Cloud Computing Today: What challenges do you foresee for Big Data achieving mainstream adoption in 2015?

John Fanelli (DataTorrent): Fast big data is gaining momentum! Every day I speak with customers and prospects about their fast big data, the use-case requirements and the projected business impact. The biggest challenge they share with me is that they are looking to move faster than they are able to, due to existing projects and the technical skills on their teams. DataTorrent RTS’ ease of use and operator libraries support almost any input/output source or sink and provide pre-built analytics modules to address those challenges.

Categories: Big Data, DataTorrent

Treasure Data Closes $15M In Series B Funding For Fully Managed, Cloud-Based Big Data Platform

This week, Treasure Data announced the finalization of $15M in Series B funding led by Scale Venture Partners. The funding will be used to accelerate the expansion of Treasure Data’s proprietary, cloud-based platform for acquiring, storing and analyzing massive amounts of data for use cases that span industries such as gaming, the internet of things and digital media. Treasure Data’s Big Data platform specializes in acquiring and processing streaming big data sets that are subsequently stored in its cloud-based infrastructure. Notable about the Treasure Data platform is that it offers customers a fully managed solution for storing streaming big data that can ingest billions of records per day, in a non-HDFS (Hadoop) format. Current customers include Equifax, Pebble, GREE, Wish.com and Pioneer, the last of which leverages the Treasure Data platform for automobile-related telematics use cases. In addition to Scale Venture Partners, all existing board members and their associated funds participated in the Series B capital raise, including Jerry Yang’s AME Venture Fund.
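Ingesting billions of records per day typically relies on buffering: streaming records are accumulated into fixed-size batches before being shipped to cloud storage. The sketch below illustrates that general pattern; the class and uploader are hypothetical stand-ins, not Treasure Data’s actual client API.

```python
# Illustrative sketch of buffered streaming ingestion: records accumulate
# in memory and are flushed to an uploader once a batch fills up.
class BatchingIngestor:
    def __init__(self, batch_size, uploader):
        self.batch_size = batch_size
        self.uploader = uploader  # callable that persists one batch
        self.buffer = []

    def ingest(self, record):
        """Buffer one record, flushing when the batch is full."""
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Ship any buffered records and reset the buffer."""
        if self.buffer:
            self.uploader(self.buffer)
            self.buffer = []

# Demonstration: collect "uploaded" batches in a local list.
batches = []
ingestor = BatchingIngestor(batch_size=3, uploader=batches.append)
for i in range(7):
    ingestor.ingest({"event_id": i})
ingestor.flush()  # ship the final partial batch
```

Real ingestion pipelines add durability (spooling to disk), retries and backpressure on top of this basic batching loop, which is precisely the operational burden a fully managed platform absorbs.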

Categories: Big Data, Treasure Data
