Redis Labs today announced the release of a Spark-Redis connector that accelerates Spark's performance relative to other database infrastructures. The connector lets Redis users apply Spark to real-time analytics on large streaming datasets. The open source connector can read from and write to Redis clusters while preserving Redis data structures. According to the announcement, running Spark on Redis accelerates Spark by a factor of 135 compared to Spark using HDFS and by a factor of 45 compared to Spark using Tachyon. Yiftach Shoolman, co-founder and CTO of Redis Labs, remarked on the significance of accelerating Spark on Redis data stores as follows:
“Big data is coming of age and customers are demanding that big data insights are extracted in real-time. This is where Redis Labs fills the gap by delivering both the right performance and optimized distributed memory infrastructure to accelerate Spark. Our goal is to make Redis the de-facto data store for any Spark deployment.”
Here, Shoolman comments on how the integration between Redis and Spark improves the derivation of analytic insights from big datasets. Spark's faster performance on Redis allows users to conduct analyses in real time while benefiting from the “optimized distributed memory infrastructure” that Redis delivers. Beyond speed, a key advantage of using Spark with Redis is that Redis lets Spark access individual data elements, avoiding the operational overhead of transferring or running analytics on large batches of data. Today’s announcement features news of a Spark-Redis connector, support for Spark SQL and the capability to use Redis as a distributed memory database for Spark. The connector’s acceleration of Spark promises to strengthen the position of Redis within the NoSQL database landscape and the database infrastructure space more generally. By accelerating Spark to speeds over 100 times faster than its performance on HDFS, Redis gives customers faster access to real-time analytics, which can be crucial for use cases that demand split-second analytic transactions on massive datasets. Going forward, Redis plans to collaborate with Spark to enable the use of Spark’s functionality for machine learning and graph database use cases as well. The graphic below illustrates the acceleration enabled by the Spark-Redis connector as compared to other database infrastructures: