Data on Kubernetes C. #42 Spark on Kubernetes is Now Generally Available: Why & How to Migrate to It

Name: Data on Kubernetes C. #42 Spark on Kubernetes is Now Generally Available: Why & How to Migrate to It
Start: 2021-04-20T09:00:00-07:00
End: 2021-04-20T10:00:00-07:00

Data on Kubernetes

Apr 20, 2021, 4:00 – 5:00 PM

Virtual event

About this event

Apache Spark has been able to run natively on Kubernetes (instead of Hadoop YARN) since 2018, but only with Spark 3.1 (released in March 2021) was the integration officially declared generally available and production-ready. What is the high-level architecture of Spark on Kubernetes? How does it compare to the alternatives? What does a migration look like? These are some of the questions we will answer together. We will first introduce the core concepts, then go through the stories of customers who migrated, and finally give you concrete technical tips to help you succeed with Spark on Kubernetes. If time permits, I may do a risky live demo. This will be a technical talk with very fresh content, and I hope you will like it. I plan to keep it short enough to leave room for Q&A and improvisation based on your requests, so let me know if there's something specific you're interested in.