Take a Deep Dive into Apache Spark

Henk Griffioen – Data Science with Spark Trainer

Henk Griffioen teaches the course Data Science with Spark and sees Apache Spark as a fantastic tool for both data scientists and data analysts.

His use of Apache Spark at one of GoDataDriven’s customers illustrates how well it scales. At Quby, a smart energy company, Henk worked on a project to help customers save energy. He describes the experience: “We first built an application for 300 customers and later scaled it to tens of thousands of users, which is a piece of cake for Spark.”

The course covers a range of datasets and applications. Participants use Apache Spark with Python to analyze flight information, wrangle churn data, and build an NLP model. The training starts with an in-depth look at Apache Spark’s inner workings and ends with a hackathon. “During this hackathon, everyone works on a case at their own pace,” Henk explains. “Analyses, machine learning, streaming, it is all covered during the course.”


Henk’s ambition is for all attendees to learn Spark techniques and see how they can apply Spark directly to their own work. “Everyone picks up what’s important for him or her. From analysts learning to work on platforms like Databricks to data scientists who want to use their models on a large scale. I have even had attendees asking how to connect their deep learning models to Apache Spark!”

Interested in the Data Science with Spark training?

Head over to Xebia Academy to check out the full program and to save yourself a spot.
