Harnessing the Power of Spark & Cassandra within your Spring App

Steve Pember at Spring I/O 2017

For many companies, the data they collect and the datasets they build are their most valuable assets. As these datasets grow, it becomes increasingly important for these organizations to analyze their data and find meaning in it, and typically they'll reach for well-known tools like Hadoop. Over the past few years, however, a new generation of data analysis tools has become available, most notably Apache Spark.

Spark is a cluster-computing framework that allows users to perform calculations against resilient in-memory datasets, distributed across multiple machines, using a functional programming interface. It has set world records for quickly processing large datasets and is currently one of the Apache Software Foundation's most active projects. Spark supports a variety of technologies as its persistence mechanism; one of the most interesting is Apache Cassandra. Cassandra is a linearly scalable, fault-tolerant, decentralized datastore that is useful if you need highly available and scalable storage. It is used heavily by some of the largest firms in tech, like Apple, Facebook, and Netflix. Spark and Cassandra are complicated, but they integrate well and provide such a level of utility that whole companies have formed around them, offering consultancy and development services.

In this talk we'll learn how Spark and Cassandra can be leveraged within your Spring application, and how to get started with the Spring XD and Spring Data integrations. We'll talk about Spark and Cassandra at a high level and walk through code examples showing what it's like to work with them programmatically. We'll discuss some of the pitfalls you will run into when working with these technologies, such as modeling your data appropriately to ensure even distribution in Cassandra and general packaging woes with Spark, along with ways to avoid them. Finally, we'll explore how we at ThirdChannel are using these technologies in the real world.