Big Data

DataTorrent Enhances Platform For Real-Time Analytics On Streaming Big Data

DataTorrent recently announced the availability of DataTorrent Real-Time Streaming (RTS) 2.0, which builds on the June release of version 1.0 by providing enhanced capabilities for running real-time analytics on streaming Big Data sets. DataTorrent RTS 2.0 boasts the ability to ingest data from “any source, any scale and any location” by means of over 75 connectors that allow the platform to ingest varieties of structured and unstructured data. In addition, this release delivers over 450 Java operators that allow data scientists to perform queries and advanced analytics on Big datasets, including predictive analytics, statistical analysis and pattern recognition. In a phone interview with John Fanelli, DataTorrent’s VP of Marketing, Cloud Computing Today learned that DataTorrent has begun work on a Private Beta of a product, codenamed Project DaVinci, that streamlines the design of applications via a visual interface allowing data scientists to graphically select data sources, analytic operators and their inter-relationships, as depicted below:

As the graphic illustrates, DataTorrent Project DaVinci (Private Beta) delivers a unique visual interface for the design of applications that leverage Hadoop-based datasets. Data scientists can take advantage of DataTorrent’s 450+ Java operators and the platform’s advanced analytics functionality to create and debug applications that utilize distributed datasets and streaming Big data. Meanwhile, DataTorrent RTS 2.0 also boasts the ability to store massive amounts of data in an “HDFS based distributed hash table” that facilitates rapid lookups of data for analytic purposes. With version 2.0, DataTorrent continues to disrupt the real-time, Big data analytics space by delivering a platform capable of ingesting data at any scale and running real-time analytics, all in the broader context of a seductive visual interface for creating Big data analytics applications. DataTorrent competes alongside technologies such as Apache Spark in this hotly contested space, but delivers a range of functionality that surpasses Spark Streaming, as illustrated by its application design, advanced analytics and flexible data ingestion capabilities.
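
The core idea behind operator-based streaming platforms of this kind is that data flows through a chain (a DAG) of small, composable operators. The following Python sketch illustrates that general pattern only; the class and operator names are invented for illustration and are not DataTorrent's actual Java operator API.

```python
# Illustrative sketch of a streaming operator pipeline: a parser operator
# feeds a windowed counter, which feeds a sink. Names are hypothetical.

class Operator:
    """Base operator: receives items, emits items to downstream operators."""
    def __init__(self):
        self.downstream = []

    def connect(self, op):
        self.downstream.append(op)
        return op                      # return op so connections can chain

    def emit(self, item):
        for op in self.downstream:
            op.process(item)

    def process(self, item):
        raise NotImplementedError


class ParseWords(Operator):
    """Split each incoming line into words (a simple parser operator)."""
    def process(self, line):
        for word in line.lower().split():
            self.emit(word)


class WindowCount(Operator):
    """Count word occurrences within the current processing window."""
    def __init__(self):
        super().__init__()
        self.counts = {}

    def process(self, word):
        self.counts[word] = self.counts.get(word, 0) + 1
        self.emit((word, self.counts[word]))


class Collector(Operator):
    """Terminal operator that gathers results (stands in for a console sink)."""
    def __init__(self):
        super().__init__()
        self.results = []

    def process(self, item):
        self.results.append(item)


# Wire the DAG: parser -> counter -> sink, then push a small "stream" through.
parser, counter, sink = ParseWords(), WindowCount(), Collector()
parser.connect(counter).connect(sink)
for line in ["big data streams", "big data analytics"]:
    parser.process(line)

print(counter.counts)  # {'big': 2, 'data': 2, 'streams': 1, 'analytics': 1}
```

A visual design tool like Project DaVinci essentially lets the user draw the `connect` calls instead of writing them.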

Categories: Big Data, DataTorrent, Hadoop

ParStream Reveals Big Data Analytics Platform For Internet Of Things

Today, ParStream reveals details of an analytics platform dedicated to the internet of things.
Featuring ParStream DB, a patented database designed to handle both the storage of massive amounts of data and real-time analytics, the ParStream platform boasts sub-second query response times, the ability to analyze billions of rows of data and time-series analytics that analyze incoming, real-time data feeds in conjunction with historical data. Moreover, ParStream’s Geo-Distributed analytics server enables data analytics and data visualization on distributed datasets in ways that accommodate the disparate data storage infrastructures of contemporary global IT organizations. ParStream’s bevy of analytic offerings sits atop ParStream DB as illustrated below:

As the graphic illustrates, ParStream’s internet of things analytics platform delivers a diverse range of analytic engines that operate upon massive volumes of data generated by appliances, wearable devices, automobiles and sensors. Moreover, the platform boasts data acquisition qua ETL/streaming technologies and data visualization technology that collectively deliver a turnkey solution for internet of things use cases. As ParStream CEO Peter Jensen told Cloud Computing Today in an interview, the ParStream solution absolves customers of the need to cobble together disparate technologies that often struggle to integrate with one another, delivering instead a holistic solution for internet of things analytics that has few counterparts in the industry to date. The conjunction of the platform’s database and advanced analytics technologies renders it especially well suited to the massive, high-velocity datasets specific to the internet of things space, particularly given its real-time analytic functionality on streaming datasets and its capability to change the direction of its analytics as data evolves. More generally, the platform illustrates the Big Data industry’s increasing preoccupation with real-time analytics on massive, streaming datasets and the concomitant challenges of storing both incoming data and the results of analytic operations on that data.
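
Analyzing an incoming real-time feed "in conjunction with historical data" typically means weighing each new reading against historical baselines. A minimal sketch of that idea, with invented sensor values and nothing specific to ParStream's implementation:

```python
# Hypothetical example: flag a new sensor reading when it deviates sharply
# from the historical average (a simple z-score style rule).
from statistics import mean, stdev

def flag_anomaly(history, reading, threshold=3.0):
    """Return True if `reading` lies more than `threshold` standard
    deviations away from the mean of the historical values."""
    mu = mean(history)
    sigma = stdev(history)
    return abs(reading - mu) > threshold * sigma

historical_temps = [20.1, 20.4, 19.8, 20.0, 20.3, 19.9, 20.2]
print(flag_anomaly(historical_temps, 20.2))  # False: within normal range
print(flag_anomaly(historical_temps, 35.0))  # True: spike worth flagging
```

In a production time-series engine the "history" would itself be a rolling window maintained by the database rather than an in-memory list.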

Categories: Big Data

Teradata and Cloudera Partner To Optimize Integration Of Big Data Platforms In Teradata Unified Architecture

Teradata today announced a partnership with enterprise Hadoop vendor Cloudera marked by the optimization of the integration between Teradata’s integrated data warehouse and Cloudera’s enterprise data hub. The collaboration between Teradata and Cloudera streamlines access to multiple data sources by means of the Teradata Unified Data Architecture (UDA). As a result of the integration, the Teradata Unified Data Architecture can access data from the Cloudera enterprise data hub by way of a unified Big Data infrastructure that has the capacity to perform data operations and analytics on massive, heterogeneous datasets featuring structured and unstructured data. As part of today’s announcement, Teradata also revealed details of Cloudera-certified connectors that can integrate with Apache Hadoop. Other components of the UDA that interface with Cloudera’s enterprise data hub include the Teradata QueryGrid, which allows users to pose analytical questions of data in both Teradata’s integrated data warehouse and the Cloudera enterprise data hub, in addition to Teradata Loom, which enables tracking, exploration, cleansing and transformation of Hadoop files. Today’s announcement of the integration between Teradata’s integrated data warehouse and Cloudera’s enterprise data hub signals an important development in the Big Data space insofar as the alignment of the two vendors’ product roadmaps promises to position Teradata strongly vis-à-vis the development of Big data analytics and processing functionality. On Cloudera’s side, the partnership renders its enterprise data hub even more compatible with one of the industry’s most respected Big Data analytic platforms and prefigures the inking of even more partnerships between Hadoop and Big Data management vendors as a means of continuing to foster deeper hardware and software integration in the Hadoop management space.
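
The essence of a federated query layer like QueryGrid is that one analytical question spans rows living in two different systems. The sketch below illustrates that idea in miniature with invented data; it shows the join logic only and has nothing to do with Teradata's actual implementation.

```python
# Hypothetical example: join warehouse rows with Hadoop-resident rows as
# though they were one table, the way a federated query layer presents them.

warehouse_orders = [            # rows living in the integrated data warehouse
    {"order_id": 1, "customer": "acme", "total": 500},
    {"order_id": 2, "customer": "globex", "total": 900},
]
hadoop_clicks = [               # rows living in the Hadoop enterprise data hub
    {"customer": "acme", "clicks": 42},
    {"customer": "globex", "clicks": 7},
]

# Answer one analytical question across both systems: orders enriched
# with clickstream counts, keyed by customer.
clicks_by_customer = {row["customer"]: row["clicks"] for row in hadoop_clicks}
joined = [
    {**order, "clicks": clicks_by_customer.get(order["customer"], 0)}
    for order in warehouse_orders
]
print(joined[0])  # {'order_id': 1, 'customer': 'acme', 'total': 500, 'clicks': 42}
```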

Categories: Big Data, Cloudera, Teradata

Treasure Data Partners With Yahoo Japan To Promote Its Cloud-based Big Data Processing And Analytics Platform

Today, Treasure Data announces a partnership with Yahoo! JAPAN whereby Yahoo! JAPAN will resell the Treasure Data platform to customers interested in leveraging the platform’s Big Data capture, processing and analytics capabilities. Branded the Yahoo! JAPAN Big Data Insight, the collaboration between Treasure Data and Yahoo! JAPAN will allow organizations to store and run analytics on massive amounts of real-time data without managing the relevant hardware infrastructure or mastering the intricacies of MapReduce. The Treasure Data platform embodies the intersection between cloud computing and Big Data given that customers have the opportunity to take advantage of Treasure Data’s cloud for storing Big Data as illustrated below:

The graphic above illustrates the Treasure Data platform’s ability to collect, store and run real-time analytics on massive amounts of cloud-based data. Worth noting is that although the platform specializes in Big Data processing and analytics, data is not stored in HDFS, the Hadoop distributed file system. Instead, the Treasure Data platform stores data in Plazma, its “own distributed columnar storage system,” which boasts attributes such as scalability, efficiency, elasticity and a schema-less architecture. Plazma’s columnar storage structure means that queries can focus on select columns of data rather than the entire dataset, thereby enabling faster queries, more effective use of the platform’s schema-less data model and superior performance all around. Plazma transforms row-based JSON data into a columnar format that optimizes storage and the processing of analytical queries. The resulting analytical platform features use cases such as web-based data from software applications and mobile applications in addition to data from internet of things devices such as appliances and wearables. Today’s announcement represents a huge coup for Treasure Data because of the co-branding of its technology alongside Yahoo!, one of the industry’s experts in the storage, processing and analysis of Big Data. Moreover, the collaboration promises to strengthen Treasure Data’s market presence in Japan and potentially pave the way for greater market expansion into Asia and the Pacific Rim more generally.
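
The row-to-columnar transformation described above can be sketched in a few lines: row-based JSON records are pivoted into per-column arrays, so a query touching one field scans only that column's values. This illustrates the general technique only, with made-up records; it is not Treasure Data's code.

```python
# Pivot row-based JSON records into a columnar layout, then answer a
# query by reading a single column instead of every record.
import json

rows_json = '''[
  {"user": "a", "event": "click", "ms": 120},
  {"user": "b", "event": "view",  "ms": 340},
  {"user": "a", "event": "view",  "ms": 95}
]'''

rows = json.loads(rows_json)

# Columnar layout: one list per field instead of one dict per row.
columns = {key: [row.get(key) for row in rows] for key in rows[0]}

# An aggregate over one field now touches one array, not the whole dataset.
avg_ms = sum(columns["ms"]) / len(columns["ms"])
print(columns["event"])  # ['click', 'view', 'view']
print(avg_ms)            # 185.0
```

Real columnar stores add per-column compression and encoding on top of this layout, which is where much of the storage efficiency comes from.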

Categories: Big Data, Treasure Data, Yahoo

Informatica Big Data Edition Comes Pre-Installed On Cloudera QuickStart VM And Hortonworks Sandbox

Earlier this month, Informatica announced 60-day free trials of Informatica Big Data Edition for the Cloudera QuickStart VM and the Hortonworks Sandbox. The trial means that the Informatica Big Data Edition comes pre-installed in the sandbox environments of two of the leading Hadoop distributions in the Big Data marketplace today. Developers using the Cloudera QuickStart VM and Hortonworks Sandbox now have streamlined access to Informatica’s renowned big data cleansing, data integration, master data management and data visualization tools. The code-free, graphical user interface-based Informatica Big Data Edition allows customers to create ETL and data integration workflows as well as take advantage of hundreds of pre-installed parsers, transformations, connectors and data quality rules for Hadoop data processing and analytics. The Informatica Big Data platform specializes in Hadoop profiling, parsing, cleansing, loading, enrichment, transformation, integration, analysis and visualization, and reportedly improves developer productivity five-fold by means of its automation and a visual interface built on the Vibe virtual data machine.
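
To make "data quality rules" concrete: a cleansing step in an ETL workflow validates and normalizes records before they are loaded for analysis. The sketch below shows the general shape of such a rule with invented records and field names; it has no relation to Informatica's actual rule engine.

```python
# Hypothetical example of an ETL data-quality step: normalize names and
# emails, and reject records with malformed email addresses.
import re

def cleanse(record):
    """Apply simple data-quality rules; return a cleaned record or None."""
    email = record.get("email", "").strip().lower()
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        return None                      # rule: reject malformed emails
    name = record.get("name", "").strip().title()
    return {"name": name, "email": email}

raw = [
    {"name": "  ada lovelace ", "email": "Ada@Example.com"},
    {"name": "bad row", "email": "not-an-email"},
]
clean = [rec for rec in (cleanse(r) for r in raw) if rec]
print(clean)  # [{'name': 'Ada Lovelace', 'email': 'ada@example.com'}]
```

A GUI-driven platform lets analysts assemble rules like this from a palette instead of writing the code by hand, which is where the claimed productivity gain comes from.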

Although the Informatica Big Data Edition supports the MapR and Pivotal Hadoop distributions, the free 60-day trial is currently available only for Cloudera and Hortonworks. Informatica’s success in seeding its Big Data Edition with Cloudera and Hortonworks increases the likelihood that developers will explore and subsequently adopt the platform as a means of discovering and manipulating Big Data sets. As such, Informatica’s Big Data Edition competes with products like Trifacta that similarly facilitate the manipulation, cleansing and visualization of Big Data by means of a code-free user interface that increases analyst productivity and accelerates the derivation of actionable business intelligence. On one hand, the recent proliferation of Big Data products that allow users to explore Big Data without learning the intricacies of MapReduce democratizes access to Hadoop-based datasets. That said, it remains to be seen whether graphical user interface-driven Big Data discovery and manipulation platforms can enable the granular identification of data anomalies, exceptions and eccentricities that may otherwise be obscured by large-scale trend analysis.

Categories: Big Data, Hadoop, Informatica

Base Enhances Sales Productivity Platform With Real-Time Analytics And Rich Data Visualization

Base, the CRM that leverages real-time data and analytics, recently announced the release of a bevy of new features and functionality that brings real-time, Big Data analytics to cloud-based sales productivity management. Base’s proprietary technology aggregates data from sources such as phone calls, in-person meetings, social network-based prospects and news feeds, and subsequently delivers real-time notifications to sales professionals. As a result, sales teams can minimize their manual input of sales-related data and instead take advantage of the analytic and data visualization capabilities of the Base platform. The Base platform testifies to a qualitative shift within the CRM space marked by the delivery of enhanced automation to sales operations workflows resulting from the conjunction of real-time data, predictive analytics and data visualization. Uzi Shmilovici, CEO of Base, remarked on the positioning of Base within the larger CRM landscape as follows:

Base picks up where other CRMs have left off. Until now, legacy cloud Sales and CRM products like Salesforce have been accepted as ‘the norm’ by the enterprise market. However, recent advancements in big data, mobility and real-time computing reveal a need for a new generation of intelligent sales software that offers flexibility, visibility, and real-time functionality. If you’re using outdated technology that cannot adapt to the advanced needs of modern day sales teams, your competition will crush you.

Here, Shmilovici comments on the way in which big data, real-time analytics and the proliferation of mobile devices have precipitated the creation of a new class of sales applications that outstrip the functionality of “legacy cloud Sales and CRM products like Salesforce.” In a phone interview with Cloud Computing Today, Shmilovici elaborated on the ability of the Base platform to aggregate disparate data sources to produce rich, multivalent profiles of sales prospects that augment the ability of sales teams to convert leads into qualified sales. Base’s ability to enhance sales operations by means of data-driven analytics is illustrated by the screenshot below:

The graphic above illustrates the platform’s ability to track sales conversions at the level of individual sales professionals as well as sales managers or owners within a team. VPs of Sales can customize analytics regarding the progress of their teams to enable enhanced talent and performance management in addition to gaining greater visibility as to where the market poses its stiffest challenges. More importantly, however, Base delivers a veritable library of customized analytics that illustrates a prominent use case for the convergence of cloud computing, real-time analytics and Big Data technologies. As such, the success of the platform will depend on its ability to continue enhancing its algorithms and analytics while concurrently enriching the user experience that remains integral to the daily experience of sales teams.

Categories: Big Data, Miscellaneous

Teradata Acquires Hadoop Consulting And Strategy Services Firm Think Big Analytics

Teradata continued its spending spree by acquiring the Mountain View, CA-based Hadoop consulting firm Think Big Analytics on Wednesday. The acquisition of Think Big Analytics will supplement Teradata’s own consulting practice. Think Big Analytics, which has roughly 100 employees, specializes in agile SDLC methodologies for Hadoop consulting engagements that typically last more than a month but less than a quarter of a year. According to Teradata Vice President of Product and Services Marketing Chris Twogood, Teradata has “now worked on enough projects that it’s been able to build reusable assets,” as reported in PCWorld. Think Big Analytics will retain its branding, and its management team will remain at the company’s Mountain View office. Teradata’s acquisition of Think Big Analytics comes roughly two months after its purchase of Revelytix and Hadapt. Revelytix provides a management framework for metadata on Hadoop, whereas Hadapt’s technology empowers SQL developers to manipulate and analyze Hadoop-based data. Teradata’s third Big Data acquisition in roughly two months comes at a moment when the Big Data space is exploding with a proliferation of vendors that differentially tackle the problems of data discovery, exploration, analysis and visualization with respect to Hadoop-based data. The question now is whether the industry will experience early market consolidation as evinced by startups snapped up by larger vendors, or whether the innovation that startups provide will survive a land grab in the Big Data space initiated by larger, well-capitalized companies seeking to complement their Big Data portfolios with newly minted Big Data products and technologies. Terms of Teradata’s acquisition of Think Big Analytics were not disclosed.

Categories: Big Data, Hadoop, Teradata | Tags:
