On Tuesday, Metanautix released Metanautix Personal Quest, a product that enables individuals to leverage the Metanautix platform to query data stored in Hadoop, NoSQL and relational database formats. With Personal Quest, individual users can run integrated analytics across relational and non-relational sources to obtain a unified view of data spread throughout an organization’s different applications and data repositories. Users can download Personal Quest to their machine and test the capabilities of the Metanautix data compute engine for an unlimited time period, subject to limits on data size and number of queries. Metanautix Quest’s distributed compute engine joins SQL and non-SQL data sources without complex ETL processes.

The video below shows how the integration of Metanautix Quest with Tableau enables customers to join data from Teradata with data from MongoDB to obtain a more granular understanding of sales by product, using a few simple drag and drop operations. The clip illustrates how Metanautix Quest can execute a distributed join that combines store sales data stored in a Teradata database with product data stored in MongoDB, enabling a comparative analysis of monthly sales across product categories such as books, children, electronics and shoes. After a visual review of sales by product category in a Tableau workbook reveals that shoes had a significant impact on overall sales, users can perform another join to drill down on shoe sales by shoe type and learn that men’s shoes and athletic shoes were largely responsible for the spike specific to the shoe category. Because the distributed join runs directly against the Teradata and MongoDB sources, the analysis requires neither ETL nor the migration of data to a centralized staging repository.
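The logical shape of the federated join described above can be sketched in a few lines of Python. The sketch below is purely illustrative: in-memory lists stand in for the Teradata sales table and the MongoDB product collection, the schema and field names are hypothetical, and Metanautix Quest would execute the equivalent logic as distributed SQL against the live sources rather than pulling rows into one process.

```python
from collections import defaultdict

# Stand-ins for the two sources: rows as they might arrive from a
# Teradata sales table and documents from a MongoDB product collection.
# All field names and values here are invented for illustration.
sales = [  # (product_id, month, amount) -- the "Teradata" side
    (1, "Jan", 120.0), (2, "Jan", 80.0), (1, "Feb", 200.0), (3, "Feb", 60.0),
]
products = [  # the "MongoDB" side
    {"product_id": 1, "category": "shoes"},
    {"product_id": 2, "category": "books"},
    {"product_id": 3, "category": "electronics"},
]

# Hash join on product_id, then aggregate sales by (category, month) --
# the same logical plan a distributed engine would run, without any ETL step.
category_of = {p["product_id"]: p["category"] for p in products}
totals = defaultdict(float)
for product_id, month, amount in sales:
    totals[(category_of[product_id], month)] += amount

print(dict(totals))
```

The point of the exercise is that the join key (here, a hypothetical `product_id`) is all the two sources need to share; neither dataset has to be migrated into the other's format first.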
As such, Metanautix Quest radically simplifies data analysis and data visualization amid the proliferation of disparate datasets in small, mid-size and enterprise-level organizations alike. By giving individual users time-unlimited access to Metanautix Personal Quest, Metanautix intends to underscore the power of its analytics engine for data stored in sources that include Hadoop, Teradata, MongoDB and other SQL and NoSQL data repositories.
As reported in The Wall Street Journal, Tachyon Nexus, the company that aims to commercialize the open source Tachyon in-memory storage system, has raised $7.5M in Series A funding from Andreessen Horowitz. Tachyon is a memory-centric storage system that epitomizes the contemporary transition away from disk-based storage toward in-memory storage. Based on the premise that memory-centric storage is increasingly affordable in comparison with disk-centric storage, Tachyon caches frequently read files in memory to create a “memory-centric, fault-tolerant, distributed storage system” that “enables reliable data sharing at memory-speed across a datacenter,” as noted in a blog post by Peter Levine, General Partner at Andreessen Horowitz. Tachyon’s memory-centric design improves upon the speed and reliability of file-based storage infrastructures to meet the requirements of big data applications that must share massive volumes of data at increasingly fast speeds. Tachyon was created by Haoyuan Li, a U.C. Berkeley doctoral candidate who developed it at the U.C. Berkeley AMPLab. Tachyon is currently used at over 50 companies and supports Spark and MapReduce as well as data stored in HDFS and NFS. Tachyon Nexus itself remains in stealth. Meanwhile, as part of the Series A investment, Peter Levine joins the board of Tachyon Nexus to support the development of what he envisions as “the future of storage” in the form of Tachyon-based storage technology.
Ford has announced that it will partner with Microsoft to use the Azure cloud to automate updates to automobile software such as its Sync 3 infotainment system, as well as to deliver functionality that enables owners to check battery levels and remotely start, lock, unlock or locate their vehicles. As a result of the partnership, Ford vehicle owners with Sync entertainment and navigation systems will no longer need to take their cars to the dealership for periodic software upgrades; instead, they can leverage the car’s ability to connect to a wireless network to download enhancements to Sync. The Azure-based Ford Service Delivery Network will launch this summer at no extra cost to end users. Use cases enabled by the partnership between Azure and Ford are illustrated below:
Despite Ford’s readiness to use long-time technology partner Microsoft for a public cloud, the Dearborn-based automobile giant prefers to keep more sensitive data, such as odometer readings, engine-related system data and performance metrics that reveal details about the operation of the vehicle, on on-premises infrastructure. Indeed, part of the reason Ford chose Microsoft was its willingness to support a hybrid cloud infrastructure that integrates an on-premises data center environment with a public cloud such as Azure. As reported in InformationWeek, Microsoft will also help Ford process and analyze the massive amounts of data that stand to be collected from its fleet of electric and non-electric vehicles. Ford’s Fusion electric vehicle, for example, creates 25 GB of data per hour and consequently requires pre-processing and filtering procedures to reduce the data to a volume manageable for aggregation, reporting and analytics. Ford’s decision to partner with Microsoft also reflects a growing trend within the automobile industry, one that includes the likes of Hyundai and Tesla, to use cloud-based technology to push software updates to vehicles and gather data for compliance and product development purposes. The key challenge for Ford, and the automobile industry at large, will hinge on its ability to acquire internet of things-related automobile data and subsequently perform real-time analytics that reduce recalls and fatalities and facilitate more profound enhancements in engineering-related research and development. Details of which Ford vehicles stand to benefit from Azure-powered software delivery this summer have yet to be disclosed.
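Ford has not disclosed the specifics of its pre-processing pipeline, but the general idea of filtering high-frequency telemetry down to a manageable volume can be sketched as follows. The signal, sampling rate and one-minute windowing scheme below are assumptions chosen for illustration only, not Ford's actual scheme.

```python
from statistics import mean

# Hypothetical raw telemetry: (timestamp_seconds, battery_pct) sampled
# every 2 seconds. A real vehicle emits many signals at high frequency;
# reducing each signal to per-window aggregates shrinks what is shipped
# to the cloud by orders of magnitude.
samples = [(t, 80.0 - 0.01 * t) for t in range(0, 180, 2)]  # 90 samples, 3 minutes

def downsample(samples, window=60):
    """Average samples into fixed windows of `window` seconds."""
    buckets = {}
    for t, value in samples:
        buckets.setdefault(t // window, []).append(value)
    return {w: round(mean(vals), 2) for w, vals in sorted(buckets.items())}

summary = downsample(samples)
print(summary)  # 90 raw samples reduced to 3 per-minute averages
```

A production pipeline would apply the same reduction per signal and per vehicle before aggregating fleet-wide, which is what makes 25 GB per hour per car tractable for reporting.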
Alation today announced the finalization of $9M in Series A funding led by Costanoa Venture Capital and Data Collective Venture Capital, with participation from Andreessen Horowitz, Bloomberg Beta and General Catalyst Partners. The funding will be used both for product development and for the expansion of Alation’s sales and marketing operations. Still in stealth mode, Alation intends to enhance the ability of organizations to access and retrieve data. Based on the premise that organizations continue to struggle to locate data housed within their own infrastructures, Alation aims to improve data accessibility and to help customers more effectively understand the quality and significance of their data. Alation CEO Satyen Sangani remarked on Alation’s value proposition as follows:
Organizations still struggle with quickly finding, understanding and using the right data. Data-driven enterprises have voracious appetites for information and continue to make massive investments in data management platforms like Hadoop and business intelligence software like Tableau. While these technologies help with computation, storage, and visualization, ironically they make it harder to navigate the ocean of data. Alation helps people get to key insights faster.
Although business intelligence platforms such as Tableau facilitate data visualization and analysis, Sangani notes, organizations continue to wrestle with the problem of maneuvering within their “ocean of data” and finding the right data for analysis. Led by a team of founders with prior experience at Oracle, Google and Apple, Alation is gearing up for a general availability release of its platform in mid-2015. As told to Cloud Computing Today by CEO Satyen Sangani, the company’s solution focuses on structured and semi-structured data and claims several large, data-driven companies within its roster of customers. Expect more details regarding Alation to emerge within the upcoming months as it comes out of stealth and takes the wraps off its platform for managing big data complexity and increasing data accessibility more generally.
Pivotal recently announced the open sourcing of key components of its Pivotal Big Data Suite. The parts of the Pivotal Big Data Suite that will be open sourced include the MPP Pivotal Greenplum Database, Pivotal HAWQ and Pivotal GemFire, the in-memory NoSQL database. Pivotal’s decision to open source the core of its Big Data Suite builds upon its success monetizing the Cloud Foundry platform and is intended to accelerate the development of analytics applications that leverage big data and real-time streaming datasets. The open sourcing of Greenplum, Pivotal’s SQL-on-Hadoop platform HAWQ and GemFire renders Pivotal’s principal analytics and database platforms more readily accessible to the developer community and encourages enterprises to experiment with Pivotal’s solutions. Sundeep Madra, VP of the Data Product Group at Pivotal, remarked on the decision as follows:
Pivotal Big Data Suite is a major milestone in the path to making big data truly accessible to the enterprise. By sharing Pivotal HD, HAWQ, Greenplum Database and GemFire capabilities with the open source community, we are contributing to the market as a whole the necessary components to build solutions that make up a next generation data infrastructure. Releasing these technologies as open source projects will only help accelerate adoption and innovation for our customers.
Pivotal’s announcement of the open sourcing of its Big Data Suite comes in tandem with a strategic alliance with Hortonworks aimed at combining the competencies of both companies to deliver best-in-class Hadoop capabilities for the enterprise. The partnership includes product roadmap alignment, integration and the implementation of a unified vision for leveraging Apache Hadoop to derive actionable business intelligence on a scale rarely achieved within the contemporary enterprise. In conjunction with the Hortonworks collaboration, Pivotal revealed its participation in the Open Data Platform, an organization dedicated to promoting the use of Big Data technologies centered around Apache Hadoop, whose Platinum members include GE, Hortonworks, IBM, Infosys, Pivotal and SAS. The Open Data Platform intends to ensure that components of the Hadoop ecosystem such as Apache Storm, Apache Spark and Hadoop-analytics applications integrate with and optimally support one another.
All told, Pivotal’s decision to open source its Big Data Suite represents a huge coup for the Big Data analytics community at large insofar as organizations now have access to some of the most sophisticated Hadoop-analytics tools in the industry at no charge. More striking, however, is the significance of Pivotal’s alignment with Hortonworks, which stands to tilt the balance of the struggle for Hadoop market share toward Hortonworks and away from competitors Cloudera and MapR, at least for the time being. Thus far, Cloudera has enjoyed notable traction in the financial services sector and within the enterprise more generally, but the enriched analytics available to the Hortonworks Data Platform by means of the partnership with Pivotal promise to render Hortonworks a more attractive solution, particularly for analytics-intensive use cases. Regardless, Pivotal’s strategic evolution, as represented by its open source move, its collaboration with Hortonworks and its leadership position in the Open Data Platform, constitutes a seismic moment in Big Data history: one of the industry’s most sophisticated big data analytics firms has united with Hortonworks, the first Hadoop vendor to go public. The obvious question now is how Cloudera and MapR will respond to the Open Data Platform, and the extent to which Pivotal’s partnership with Hadoop distributions remains exclusive to, or focused on, Hortonworks in the near future.
On Tuesday, HP announced details of HP Haven Predictive Analytics, a platform that delivers machine learning and statistical analysis for large-scale datasets. HP Haven Predictive Analytics features Distributed R, an analytics engine based on the R programming language and designed to tackle some of the most complex Big Data predictive analytics tasks in the industry. The platform also boasts support for SQL and HP Vertica, in addition to preconfigured algorithms that give developers out-of-the-box, R-based analytics. The hallmark of the offering, however, is the Distributed R engine, which parallelizes R processing so that R’s predictive analytics can be applied to big data sets. The conjunction of the platform’s data acceleration functionality with its distributed use of the open source R language stands to improve analytic performance on large datasets and to enable statisticians to derive actionable business intelligence from petabytes of data with speed and analytic sophistication. As such, HP Haven Predictive Analytics augments the analytic power of the Vertica SQL on Hadoop platform by delivering a Big Data predictive analytics platform capable of analyzing structured and unstructured data via the cloud or an on-premises deployment.
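HP has not detailed Distributed R's internals in this announcement, but the general pattern behind parallel statistical computation of this kind, partitioning the data across workers and combining each partition's partial sufficient statistics, can be sketched for simple linear regression. The sketch below is a conceptual illustration in Python, not HP's implementation; the dataset and partition count are invented.

```python
# Data-parallel simple linear regression: each partition computes partial
# sums (sufficient statistics) independently, and the partials combine
# exactly -- the core trick that lets a distributed engine fit a model
# without moving raw data to one node. Illustrative sketch only.
data = [(x, 2.0 * x + 1.0) for x in range(100)]  # y = 2x + 1, noiseless
partitions = [data[i::4] for i in range(4)]      # 4 simulated "workers"

def partial_sums(part):
    """Per-partition sufficient statistics for least squares."""
    n = len(part)
    sx = sum(x for x, _ in part)
    sy = sum(y for _, y in part)
    sxx = sum(x * x for x, _ in part)
    sxy = sum(x * y for x, y in part)
    return n, sx, sy, sxx, sxy

# "Reduce" step: element-wise sum of every partition's statistics.
n, sx, sy, sxx, sxy = map(sum, zip(*(partial_sums(p) for p in partitions)))
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n
print(slope, intercept)  # recovers 2.0 and 1.0
```

Because the reduce step sees only a handful of numbers per partition rather than the raw rows, the same pattern scales to data volumes that no single R process could hold in memory.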
Cloudera And Cask Partner To Align Cask’s Application Development Platform With Cloudera’s Hadoop Product Portfolio
Cloudera and Cask recently announced a strategic collaboration marked by a commitment to integrate the product roadmaps of both companies into a unified vision based around the goal of empowering developers to more easily build and deploy applications using Hadoop. As part of the collaboration, Cloudera made an equity investment in Cask, the company formerly known as Continuity. Cask’s flagship product is the Cask Data Application Platform (CDAP), an application platform used to streamline and simplify Hadoop-based application development, in addition to delivering operational tools for integrating application components and performing runtime services. The integration of the open source CDAP with Cloudera’s open source Hadoop distribution represents a huge coup for Cask insofar as its technology stands to become tightly integrated with one of the most popular Hadoop distributions in the industry, and positions Cask as a potential acquisition target for Cloudera as its product matures. Cloudera, on the other hand, stands to gain from Cask’s progress in building a platform for Big Data application development that runs natively within a Hadoop infrastructure. By aligning its product roadmap with Cask’s, Cloudera adds yet another feather to its cap with respect to tools and platforms within its ecosystem that enhance and accelerate Hadoop adoption. Overall, the partnership strengthens Cloudera’s case for going public by illustrating the astuteness and breadth of its vision when it comes to strategic partners and collaborators such as Cask, not to mention the business and technological benefits of the partnership itself. Expect Cloudera to continue aggressively building out its partner ecosystem as it hurtles toward an IPO it may well already be preparing, as reported in VentureBeat.