Streaming Architecture at Scale
While some applications can afford to process data in batches, others, such as those in the IoT ecosystem, expect events to be ingested and processed as they occur. This training focuses on two key players on the streaming side of data processing: Apache Kafka and Apache Spark.
What you'll learn
- Fundamentals of queue messaging systems
- Fundamentals of the Kafka architecture
- Fundamentals of Spark Streaming, with concepts such as checkpointing, watermarking, streaming windows and more
- How to consume and process events from Kafka with Spark
- How streaming topics work
- The basics of messaging systems
- The concept of topics
- Design considerations for a messaging system
- Run a Kafka cluster with docker-compose
- Set up Spark as a consumer for Kafka
- Process incoming events in real time
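To give a feel for two of the concepts listed above, streaming windows and watermarking, here is a minimal plain-Python sketch. It is not course material and does not use Spark or Kafka: the function name `tumbling_window_counts` and its parameters are hypothetical, and the late-event rule is a deliberate simplification of how a streaming engine would apply a watermark.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds=10, allowed_lateness=5):
    """Count events per (window, key), dropping events that arrive too late.

    events: iterable of (event_time_seconds, key) tuples in arrival order.
    Simplified illustration only; real engines apply watermarks per batch
    and per window rather than per event.
    """
    counts = defaultdict(int)
    dropped = []
    max_event_time = float("-inf")

    for event_time, key in events:
        # The watermark trails the largest event time seen so far.
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - allowed_lateness

        # Events older than the watermark are considered too late.
        if event_time < watermark:
            dropped.append((event_time, key))
            continue

        # Tumbling windows: fixed-size, non-overlapping time buckets.
        window_start = (event_time // window_seconds) * window_seconds
        counts[(window_start, key)] += 1

    return dict(counts), dropped
```

For example, calling the function with events arriving in the order `(1, "a")`, `(3, "a")`, `(12, "a")`, `(2, "a")`, `(11, "b")` counts the first three and the last event in their 10-second windows, while `(2, "a")` is dropped because it arrives after the watermark has already moved past it.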
This online course is perfect for
IT engineers and architects who deal with data stream processing architectures. Basic experience with Python and Apache Spark is required. If you’re not quite there yet, we recommend the Python for Data Engineers and Data Processing at Scale courses, respectively, as preparation for this training.
What will you learn during Streaming Architecture at Scale?
After this training, you will understand how queue messaging systems work, how to route incoming events in real time with Apache Kafka, and how to process them in real time with Apache Spark.
Andrew Snare, Big Data Hacker
Andrew is a Big Data Hacker at GoDataDriven. He is an experienced software engineer with a deep understanding of numerous technologies and languages.
Andrew is a certified Cloudera, Databricks, and Cassandra instructor, and also enjoys sharing his experiences on stage, for example at Goto Conference.
Structured, to-the-point, good combination of theory and practical examples, very knowledgeable trainer who can explain concepts very well
It was a hands-on and tangible course. We could apply what we learned in a matter of minutes. The trainer did a great job of answering ad-hoc questions that complemented the material. We appreciated the fact that we could apply what we were taught directly to our company.
I liked every aspect of this training and would like to thank the trainers. They did an excellent job of explaining how to use Spark for data science. This is the fourth GoDataDriven training I’ve followed. All were great, but this was the best one so far.
Climbing a steep Python and Machine Learning curve in three days. This would have taken me months on my own.