Data on Kubernetes C. #42 Spark on Kubernetes is Now Generally Available: Why & How to Migrate to It

Data on Kubernetes
Tue, Apr 20, 9:00 AM (PDT)

About this event

Apache Spark has run natively on Kubernetes (instead of Hadoop YARN) since 2018, but only with Spark 3.1 (released in March 2021) did the integration become officially generally available and production-ready. What is the high-level architecture of Spark on Kubernetes, how does it compare to the alternatives, and what does a migration look like? These are some of the questions we will answer together. We will first introduce the core concepts, then go through the stories of customers who migrated, and finally share concrete technical tips to help you succeed with Spark on Kubernetes. If time permits, I may do a risky live demo. This will be a technical talk with very fresh content, and I hope you will like it. I plan to keep it short enough to leave room for Q&A and improvisation based on your requests, so let me know if there's something specific you're interested in.

Speaker

  • Jean-Yves Stephan

    Data Mechanics

    Co-Founder & CEO

    I'm one of the co-founders at Data Mechanics (https://www.datamechanics.co), a Cloud-Native Spark Platform for Data Engineers. We're a Y Combinator-backed startup. We strive to finally make Apache Spark as developer-friendly and cost-effective as it should be, by automating the infrastructure management side (autoscaling, automated sizing of containers, autotuning of Spark configurations) and b...

Organizers

  • Ihor Dvoretskyi

    Cloud Native Computing Foundation

    Organizer

  • Bart Farrell

    Data on Kubernetes Community

    Organizer

  • Melissa Logan

    Organizer

  • Diogenese Topper

    Organizer

  • Iker Arce

    Organizer
