DoK Talks #138 - Build your own social media analytics with Apache Kafka

Data on Kubernetes
Thu, Jun 23, 2022, 9:00 AM (PDT)

Jakub Scholz - Senior Principal Software Engineer , Red Hat

About this event


Apache Kafka is more than just a messaging broker. It has a rich ecosystem of different components. There are connectors for importing and exporting data, different stream processing libraries, schema registries and a lot more.

The first part of this talk will explain the Apache Kafka ecosystem and how the different components can be used to load data from social networks and use stream processing and machine learning to analyze them.

The second part will show a demo running on Kubernetes which will use Kafka Connect to load data from Twitter and analyze them using the Kafka Streams API.

After this talk, the attendees should be able to better understand the full advantages of the Apache Kafka ecosystem especially with focus on Kafka Connect and Kafka Streams API. And they should be also able to use these components on top of Kubernetes.


The key takeaway of this talk is that Apache Kafka is more than just a messaging broker. It is a platform and ecosystem of different components which can be used to solve complex tasks when dealing with events or processing data. The talk demonstrates this on loading tweets from Twitter and processing them using the different parts of the Kafka ecosystem. The whole talk and its demos are running on Kubernetes using the Strimzi project. So it also shows how to easily run all the different components on top of Kubernetes with the help of few simple YAML files.