Today, Snowflake Computing emerged from stealth to announce the Snowflake Elastic Data Warehouse, branded as the first data warehouse designed specifically for cloud computing infrastructures. The Snowflake Elastic Data Warehouse spares customers the hassle of managing an on-premises data warehouse by delivering a data warehouse as a service. Moreover, the platform’s patent-pending technology delivers a SQL-compliant data warehouse that scales elastically with the number of users, the volume of incoming data and the workload. The platform’s “multidimensional elasticity” enables users to load and query data concurrently, with performance that supports rapid ingestion and transformation of data into a relational database structure. Meanwhile, the product decouples storage from compute within a larger architecture dedicated to optimizing cloud storage and querying capabilities. Customers interact with data stored in the Snowflake Elastic Data Warehouse via a SaaS user interface that delivers query results as illustrated by the screenshot below:
Whereas competitors such as Teradata and Oracle typically deliver on-premises data warehouse products, the Snowflake Elastic Data Warehouse allows analysts to run queries such as the one shown in the graphic above via a web browser. Today’s announcement of the Snowflake Elastic Data Warehouse coincides with the company’s finalization of $26M in Series B funding led by Redpoint Ventures with additional participation from Sutter Hill Ventures and Wing Ventures. With an extra $26M in cash, Snowflake Computing stands poised to continue disrupting the traditional data warehouse space by delivering a warehousing platform uniquely configured for the cloud. As such, the platform embodies the convergence of cloud computing and Big Data by delivering cloud-optimized performance for the storage and retrieval of petabytes of real-time, streaming data.
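Because the service is SQL-compliant, the queries analysts run against it look like queries against any relational engine. A minimal sketch of such an analyst-style aggregation follows, using Python’s built-in sqlite3 as a stand-in for the warehouse; the table and column names are hypothetical, and Snowflake’s own client differs.

```python
import sqlite3

# Illustrative only: a SQL-compliant warehouse accepts standard aggregation
# queries. sqlite3 stands in for the warehouse here; "page_views" and its
# columns are invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page_views (region TEXT, views INTEGER);
INSERT INTO page_views VALUES ('us-east', 120), ('us-east', 80), ('eu-west', 50);
""")

rows = conn.execute("""
    SELECT region, SUM(views) AS total_views
    FROM page_views
    GROUP BY region
    ORDER BY total_views DESC
""").fetchall()
print(rows)  # [('us-east', 200), ('eu-west', 50)]
```

The point of a warehouse-as-a-service is that only the query travels from the analyst; provisioning, scaling and storage management stay on the provider’s side.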
Last week, Bloomberg reported that EMC planned to acquire Cloudscaling, the commercial OpenStack vendor, for “less than $50M” according to unidentified sources close to the deal. Founded in 2006, Cloudscaling has raised approximately $10M in capital to date from sources such as Trinity Ventures, Juniper Networks and Seagate. The deal represents the second major acquisition of an OpenStack startup, following Cisco’s recent acquisition of Metacloud for an undisclosed sum. By acquiring Cloudscaling, EMC stands to benefit from the addition of Cloudscaling CEO Randy Bias to its cloud services team. Bias, a well-known cloud luminary and OpenStack evangelist, will strengthen EMC’s positioning within the IaaS space by giving the company enviable cloud credentials and thought leadership from day one of Cloudscaling’s integration into the EMC ecosystem. Details of the acquisition have yet to be disclosed although, given EMC’s ownership of VMware and Pivotal, the purchase of Cloudscaling is likely to have significant repercussions for the cloud and Big Data spaces, particularly given Joshua McKenty’s recent transition from Piston Cloud Computing to Pivotal as field CTO. With McKenty and Randy Bias on its corporate and spinoff rosters, in conjunction with Cloudscaling’s technology and talent, EMC looks set to make an aggressive move to combine cloud technologies with Big Data application development in ways that few entities other than Amazon Web Services have achieved. More generally, EMC’s acquisition of Cloudscaling illustrates how OpenStack startups have become hot commodities, likely to be snapped up by larger vendors intent on claiming a larger piece of the OpenStack pie. The industry should expect OpenStack-related acquisitions to proliferate as OpenStack matures and the startups that entered the space three to four years ago increasingly productize and perfect the deployment of their IaaS platforms.
DataTorrent recently announced the availability of DataTorrent Real-Time Streaming (RTS) 2.0, which builds on its June release of version 1.0 by providing enhanced capabilities for running real-time analytics on streaming Big Data sets. DataTorrent RTS 2.0 boasts the ability to ingest data from “any source, any scale and any location” by means of over 75 connectors that allow the platform to ingest varieties of structured and unstructured data. In addition, this release delivers over 450 Java operators that allow data scientists to perform queries and advanced analytics on Big Data sets, including predictive analytics, statistical analysis and pattern recognition. In a phone interview with John Fanelli, DataTorrent’s VP of Marketing, Cloud Computing Today learned that the company has begun work on a Private Beta of a product, codenamed Project DaVinci, to streamline the design of applications via a visual interface that allows data scientists to graphically select data sources, analytic operators and the relationships between them, as depicted below:
As the graphic illustrates, DataTorrent Project DaVinci (Private Beta) delivers a unique visual interface for the design of applications that leverage Hadoop-based datasets. Data scientists can take advantage of DataTorrent’s 450+ Java operators and the platform’s advanced analytics functionality to create and debug applications that utilize distributed datasets and streaming Big Data. Meanwhile, DataTorrent RTS 2.0 also boasts the ability to store massive amounts of data in a “HDFS based distributed hash table” that facilitates rapid lookups of data for analytic purposes. With version 2.0, DataTorrent continues to disrupt the real-time Big Data analytics space by delivering a platform capable of ingesting data at any scale and running real-time analytics, all within the broader context of a seductive visual interface for creating Big Data analytics applications. DataTorrent competes in the hotly contested real-time Big Data analytics space alongside technologies such as Apache Spark, but delivers a range of functionality that exceeds that of Spark Streaming, as illustrated by its application design, advanced analytics and flexible data ingestion capabilities.
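The operator model described above amounts to wiring transformations into a directed graph through which events stream. The following is a minimal, hypothetical sketch of that idea in Python generators; DataTorrent’s actual operators are Java classes running on Hadoop, and every name here is invented for illustration.

```python
# Hypothetical sketch of an operator pipeline: each "operator" consumes a
# stream of events and yields a transformed stream, so composing them forms
# a DAG like the ones Project DaVinci lets users draw visually.

def source(events):
    """Emit raw events into the pipeline."""
    for e in events:
        yield e

def filter_op(stream, predicate):
    """Pass through only events matching the predicate."""
    for e in stream:
        if predicate(e):
            yield e

def running_sum(stream, key):
    """Emit the running total of a numeric field, event by event."""
    total = 0
    for e in stream:
        total += e[key]
        yield total

events = [{"sensor": "a", "value": 3},
          {"sensor": "b", "value": 5},
          {"sensor": "a", "value": 2}]

# Wire source -> filter -> aggregate, keeping only sensor "a".
pipeline = running_sum(
    filter_op(source(events), lambda e: e["sensor"] == "a"),
    "value")
print(list(pipeline))  # [3, 5]
```

Because each stage is lazy, events flow through the graph one at a time, which is the property that makes this style of pipeline suitable for unbounded, real-time streams.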
On October 14, MongoDB announced major enhancements to its cloud-based MongoDB Management Service (MMS) for managing MongoDB deployments. The most recent version of MMS introduces significant operational efficiencies that streamline and simplify the deployment and subsequent operational management of MongoDB. For example, MMS now enables users to provision MongoDB deployments with one click and configure the resulting infrastructure with minimal manual intervention and decision-making. Moreover, the recent enhancements streamline the ability to upgrade and downgrade deployments expeditiously as well as to seamlessly scale out deployments to accommodate customer growth. Notably, this release boasts a deeper integration with Amazon Web Services that gives customers greater control over MongoDB deployments on AWS, as illustrated by the screenshot below:
As told to Cloud Computing Today by Kelly Stirman, MongoDB’s Director of Products, MongoDB Management Service users can now deploy Amazon Web Services instances from within the MMS infrastructure itself by using the automation agent functionality depicted above. Previously, MMS customers needed to provision AWS instances independently from within the AWS platform, but they can now leverage the deep integration between MMS and AWS to enjoy greater operational efficiencies specific to the deployment of AWS infrastructures containing MongoDB deployments. That said, MMS remains infrastructure agnostic and can work with any public cloud, on-premises environment or hybrid cloud infrastructure although, in the case of non-AWS hosting environments, customers will need to independently configure and deploy the underlying infrastructure outside of MMS. The other notable feature of MMS is that it now operates on a freemium model that allows customers to take advantage of its functionality free of charge for up to eight servers. The freemium model positions MongoDB to significantly expand the range of customers that opt to try out the functionality of MMS, and continues to propel the company toward a lucrative IPO.
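At its core, an automation agent of the kind described above works by reconciling desired state against observed state: it compares the goal deployment with what is currently running and derives the actions needed to converge. The sketch below models that idea in a few lines of Python; the data shapes and action names are hypothetical, not MMS’s actual format.

```python
# A minimal desired-state reconciliation sketch, assuming a simple mapping
# of host -> MongoDB version on each side. Purely illustrative of the
# automation-agent concept; not MMS's real protocol.

def plan_actions(current, goal):
    actions = []
    for host, version in goal.items():
        if host not in current:
            actions.append(("provision", host, version))   # missing host
        elif current[host] != version:
            actions.append(("upgrade", host, version))     # version drift
    for host in current:
        if host not in goal:
            actions.append(("decommission", host))         # no longer wanted
    return actions

current = {"db1": "2.4", "db2": "2.6"}
goal = {"db1": "2.6", "db2": "2.6", "db3": "2.6"}
print(plan_actions(current, goal))
# [('upgrade', 'db1', '2.6'), ('provision', 'db3', '2.6')]
```

Running this loop continuously is what lets a management service offer one-click upgrades, downgrades and scale-out: the operator edits the goal, and the agent computes and applies the difference.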
Today, ParStream revealed details of an analytics platform dedicated to the internet of things.
Featuring ParStream DB, a patented database designed to handle the storage of massive amounts of data and real-time analytics, the ParStream platform boasts sub-second query response times, the ability to analyze billions of rows of data, and time-series analytics that evaluate incoming, real-time data feeds in conjunction with historical data. Moreover, ParStream’s Geo-Distributed analytics server enables data analytics and data visualization on distributed datasets in ways that accommodate the disparate data storage infrastructures of contemporary global IT organizations. ParStream’s bevy of analytic offerings sits atop ParStream DB as illustrated below:
As the graphic illustrates, ParStream’s internet of things analytics platform delivers a diverse range of analytic engines that operate upon massive volumes of data produced by appliances, wearable devices, automobiles and sensors. Moreover, the platform boasts data acquisition via ETL/streaming technologies and data visualization technology that collectively deliver a turnkey solution for internet of things use cases. As ParStream’s CEO, Peter Jensen, told Cloud Computing Today in an interview, the ParStream solution absolves customers of the necessity of cobbling together disparate technologies that often struggle to integrate with one another, delivering instead a holistic solution for internet of things analytics that has few counterparts in the industry to date. The conjunction of the platform’s database and advanced analytics technologies renders it especially well suited for the massive, high-velocity datasets specific to the internet of things space, particularly given its real-time analytic functionality on streaming datasets and its capability to change the direction of its analytics as data evolves. More generally, the platform illustrates the Big Data industry’s increasing preoccupation with real-time analytics on massive, streaming datasets and the concomitant challenges associated with storing incoming data and the results of analytic operations on that data.
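The time-series pattern described here, evaluating a live feed against an aggregate of historical data, can be sketched in a few lines. The following is an illustrative model only, assuming a simple baseline-versus-recent-window comparison; it does not represent ParStream DB’s actual interface, and all names are invented.

```python
from collections import deque

# Hedged sketch: compare a sliding window of real-time sensor readings
# against a baseline computed from historical data, flagging deviations.
class AnomalyDetector:
    def __init__(self, history, window=3, factor=2.0):
        self.baseline = sum(history) / len(history)  # historical average
        self.window = deque(maxlen=window)           # recent live readings
        self.factor = factor                         # deviation threshold

    def ingest(self, value):
        """Ingest one real-time reading; return True if it looks anomalous."""
        self.window.append(value)
        recent = sum(self.window) / len(self.window)
        return recent > self.factor * self.baseline

det = AnomalyDetector(history=[10, 12, 11, 9])       # baseline = 10.5
print([det.ingest(v) for v in [10, 11, 40, 45]])     # [False, False, False, True]
```

The design point this illustrates is the one ParStream emphasizes: the analytics must run as data arrives, reading by reading, rather than in a batch job after the fact.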
Salesforce is set to announce details of Wave, an analytics and business intelligence product which it will formally reveal at its Dreamforce conference this week. Wave provides Salesforce users with data visualization functionality for data stored in Salesforce and elsewhere and, as such, constitutes a crucial addition to the Salesforce portfolio given that the company had previously depended on partnerships with business intelligence vendors to enable its customers to run analytics and visualizations of their data. Wave builds on Salesforce’s acquisition of EdgeSpring last year, given that EdgeSpring provides “search-indexing capabilities for Wave,” as Anna Rosenman, Salesforce’s Director of Cloud Analytics, told VentureBeat in an interview. Wave represents a critical addition for Salesforce and promises to change the landscape of customer relationship management (CRM) platforms loaded with analytics tools by revitalizing the ability of Salesforce to play in the hotly contested space of analytics and data visualization. The Wave mobile app for Apple iOS will be available this week as Salesforce reinvents itself in recognition of the ascendancy of cloud-based predictive analytics, data visualization and machine learning platforms.
Teradata today announced a partnership with enterprise Hadoop vendor Cloudera marked by the optimization of the integration between Teradata’s integrated data warehouse and Cloudera’s enterprise data hub. The collaboration between Teradata and Cloudera streamlines access to multiple data sources by means of the Teradata Unified Data Architecture (UDA). As a result of the integration, the Teradata Unified Data Architecture can access data from the Cloudera enterprise data hub by way of a unified Big Data infrastructure that has the capacity to perform data operations and analytics on massive, heterogeneous datasets featuring structured and unstructured data. As part of today’s announcement, Teradata also revealed details of Cloudera-certified connectors that can integrate with Apache Hadoop. Other components of the UDA that interface with Cloudera’s enterprise data hub include the Teradata QueryGrid, which allows users to pose analytical questions of data in both Teradata’s integrated data warehouse and the Cloudera enterprise data hub, in addition to Teradata Loom, which enables tracking, exploration, cleansing and transformation of Hadoop files. Today’s announcement of the integration between Teradata’s integrated data warehouse and Cloudera’s enterprise data hub signals an important development in the Big Data space because the alignment of the two vendors’ product roadmaps promises to position Teradata strongly vis-à-vis the development of Big Data analytics and processing functionality. On Cloudera’s side, the partnership renders its enterprise data hub even more compatible with one of the industry’s most respected Big Data analytics platforms and prefigures the inking of further partnerships between Hadoop and Big Data management vendors aimed at fostering deeper hardware and software integration in the Hadoop management space.
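The value of a federated layer like QueryGrid is that one analytical question can span the warehouse and the Hadoop hub. As a rough sketch of that idea, the example below uses two in-memory sqlite3 databases to stand in for the two stores and performs the cross-store join client-side; QueryGrid itself pushes such work into the engines, and all table and column names here are hypothetical.

```python
import sqlite3

# Store 1 stands in for the Teradata integrated data warehouse
# (structured, curated customer records).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
warehouse.executemany("INSERT INTO customers VALUES (?, ?)",
                      [(1, "Acme"), (2, "Globex")])

# Store 2 stands in for the Cloudera enterprise data hub
# (high-volume event data such as clickstreams).
hub = sqlite3.connect(":memory:")
hub.execute("CREATE TABLE clickstream (customer_id INTEGER, clicks INTEGER)")
hub.executemany("INSERT INTO clickstream VALUES (?, ?)",
                [(1, 40), (1, 10), (2, 7)])

# Aggregate in the hub, then join against the warehouse: one analytical
# question answered across both stores.
clicks = dict(hub.execute(
    "SELECT customer_id, SUM(clicks) FROM clickstream GROUP BY customer_id"))
result = [(name, clicks.get(cid, 0))
          for cid, name in warehouse.execute("SELECT id, name FROM customers")]
print(result)  # [('Acme', 50), ('Globex', 7)]
```

The design choice this illustrates is pushing the heavy aggregation to the store that holds the bulk data and moving only the reduced result across the boundary, which is the general principle behind warehouse-to-Hadoop query federation.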