3:00 PM | Opening Remarks | Join us as Chase Christensen, one of the Chairs of the Kubeflow Outreach Committee, opens the day with a warm welcome from the Kubeflow community. He’ll introduce Kubeflow, how to get involved, where to share your ideas, and how to connect with the community. We’ll also walk through the day’s schedule and highlight the exciting talks and discussions ahead. |
3:05 PM | Bringing Kubeflow Training Local: SDK-Driven “Local-exec” Mode | Kubeflow’s Python SDK makes it easy to define and submit training jobs to remote clusters, but developing, debugging, and iterating on your training code still often requires a full Kubernetes round-trip. In this session you’ll learn how the SDK’s new local_exec execution mode lets you run your training job locally on your own machine before submitting it to Kubernetes. |
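The core idea of a local-exec mode can be sketched in a few lines: the same training function either runs in-process or gets submitted to a cluster, depending on the backend you choose. This is an illustrative sketch only; the `train` and `run` names are hypothetical stand-ins, not the SDK’s actual API.

```python
def train(epochs: int, lr: float) -> dict:
    """A toy training function of the kind you would hand to the SDK."""
    loss = 1.0
    for _ in range(epochs):
        loss *= (1 - lr)  # pretend each epoch shrinks the loss
    return {"final_loss": loss}


def run(fn, *, backend: str, **kwargs):
    """Dispatch the same function to a local or a cluster backend.

    'local' calls the function in-process, which is the essence of a
    local-exec mode; cluster submission is stubbed out in this sketch.
    """
    if backend == "local":
        return fn(**kwargs)
    raise NotImplementedError("cluster submission requires a real cluster")


# Iterate locally first; switch the backend when the code is ready.
result = run(train, backend="local", epochs=10, lr=0.1)
```

Because the training function is unchanged between backends, anything you debug locally carries over directly to the cluster run.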
3:20 PM | Transition Time | Take five to grab a snack, refill your coffee, or send that quick email while our next speaker gets set up. |
3:25 PM | LLM Inference in Production with Kubernetes and Kubeflow | Large Language Models (LLMs) are powerful, but deploying them reliably, cost-effectively, and at scale in production is a different challenge altogether. In this session, we’ll walk through how to operationalize LLM inference using Kubeflow on Kubernetes, leveraging open-source and cloud-native tools to build resilient, scalable, and observable GenAI infrastructure. |
3:40 PM | Transition Time | Take a few minutes to stretch, refuel, or catch up on messages while we get ready for the next session. |
3:45 PM | Streamline LLM Fine-Tuning on Kubernetes with Kubeflow LLM Trainer | Fine-tuning LLMs on Kubernetes is challenging for data scientists due to complex Kubernetes configurations, diverse fine-tuning techniques, and different distributed strategies like data and model parallelism.
It’s crucial to hide the complex infrastructure configurations from users, and allow them to gracefully shift among diverse models, datasets, fine-tuning techniques and distributed strategies.
This talk will introduce Kubeflow LLM Trainer, a tool that leverages pre-configured blueprints and flexible configuration overrides to streamline the LLM fine-tuning lifecycle on Kubernetes.
Shao Wang (Kubeflow WG Training/AutoML) will demonstrate how Kubeflow LLM Trainer integrates with multiple fine-tuning techniques and distributed strategies, while offering a simple yet flexible Python API.
Attendees will see how LLMs can be fine-tuned on Kubernetes with just a single line of code, highlighting how the Kubeflow LLM Trainer streamlines, simplifies, and scales LLM fine-tuning on Kubernetes. |
4:15 PM | Break | We’re taking a 15-minute break! Use this time to grab a bite, take a walk, or recharge before we jump back into the next session. |
4:30 PM | Kubeflow for Enabling AI-Powered Drug Discovery and Development at AstraZeneca | AstraZeneca’s robust AI platform, Azimuth—their first enterprise cloud-native machine learning platform—relies heavily on Kubeflow to power scalable and efficient AI workflows. In this session, we’ll explore how Kubeflow supports diverse AI use cases, with each project operating in its own dedicated namespace and persistent volumes ensuring durable data storage.
We’ll cover how to enable cross-namespace volume access, build custom Kubeflow notebook images with VS Code and other editors, and use self-hosted GitHub runners to trigger pipelines. You’ll also see how integrations with tools like Grafana, ArgoCD, and Argo Workflows enhance the platform’s functionality.
To maintain security and compliance, custom image governance is enforced through Kyverno policies. Finally, we’ll introduce the GreenOps framework—a set of practices focused on building sustainable AI solutions.
Join us for an in-depth look at how Kubeflow powers enterprise-scale AI at AstraZeneca. |
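As a taste of the image-governance approach mentioned above, a Kyverno policy can restrict workloads to images from an approved registry. This is a minimal illustrative sketch; the policy name and registry are assumptions, not AstraZeneca’s actual configuration.

```yaml
# Hedged sketch of Kyverno-based custom image governance.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-notebook-images   # hypothetical policy name
spec:
  validationFailureAction: Enforce
  rules:
    - name: allowed-registries
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must come from the approved internal registry."
        pattern:
          spec:
            containers:
              - image: "registry.example.com/*"   # placeholder registry
```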
5:00 PM | Transition Time | We’ll get started in just a few minutes—take five to stretch, top off your drink, or get settled before the next session begins. |
5:05 PM | Spark Operator - Feature Engineering with Spark on Kubeflow | Real-world ML rarely deals with clean tables—more often, it involves messy inputs like PDFs, scanned documents, images, ZIP files, and data from enterprise warehouses.
In this session, we’ll explore how to transform that diverse data into model-ready features using Apache Spark with the Kubeflow Spark Operator, all orchestrated through Kubeflow Pipelines.
We’ll walk through how this approach bridges a previous gap in Kubeflow: extracting actionable insights from massive volumes of raw data—hundreds of terabytes—using fully open-source tools and technologies.
Target Audience: Data and ML engineers with basic Spark or Kubernetes experience. |
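For orientation, a feature-engineering job of the kind discussed is declared to the Kubeflow Spark Operator as a SparkApplication resource. The image, script path, and sizing below are placeholders, not values from the talk.

```yaml
# Minimal illustrative SparkApplication for the Kubeflow Spark Operator.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: feature-engineering        # hypothetical job name
spec:
  type: Python
  mode: cluster
  image: spark:3.5.0               # placeholder image
  mainApplicationFile: local:///opt/jobs/extract_features.py  # placeholder path
  sparkVersion: "3.5.0"
  driver:
    cores: 1
    memory: 2g
  executor:
    instances: 4                   # scale out for hundreds of terabytes
    cores: 2
    memory: 4g
```

A Kubeflow Pipelines step can then create and monitor this resource, tying the Spark transformation into the rest of the ML workflow.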
5:35 PM | Transition Time | We’re taking five! Step away for a moment while we prepare for the upcoming session. |
5:40 PM | Simplifying Generative AI Model Training on Kubernetes using Helm Charts | Training generative AI models on Kubernetes offers a wide range of frameworks, tools, and orchestration options. While this diversity fuels innovation, it also introduces significant complexity.
In this talk, we present a Helm-based approach that simplifies AI model training using Kubeflow Training Operators. This method abstracts much of the underlying complexity while preserving flexibility in choosing training technologies.
Our solution is accelerator-agnostic and provides a consistent YAML interface across various training frameworks. We’ll also introduce a new Kubeflow Pipeline component that enables the construction of complex, end-to-end training workflows using Helm charts.
Through real-world examples, we’ll showcase training pipelines using Accelerate, Ray Train + Lightning, and NVIDIA’s NeMo-Megatron libraries. We’ll also demonstrate automatic scaling of accelerator infrastructure using Karpenter. |
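To illustrate what a consistent YAML interface across frameworks might look like, here is a hypothetical values.yaml sketch; the key names are assumptions about such a chart, not the actual chart presented in the talk.

```yaml
# Hedged sketch: one values shape, many training backends.
training:
  framework: accelerate            # or: ray-lightning, nemo-megatron
  image: ghcr.io/example/trainer:latest   # placeholder image
  workers: 4
  acceleratorsPerWorker: 8
  acceleratorType: nvidia.com/gpu  # accelerator-agnostic: swap the resource name
  command: ["accelerate", "launch", "train.py"]
```

Keeping the framework choice to a single key is what lets the chart abstract the underlying Training Operator manifests while remaining flexible.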
6:10 PM | Closing Remarks | Join Valentina Rodriguez Sosa, one of the Chairs of the Kubeflow Outreach Committee and Principal Architect at Red Hat, as she closes out the day and shares a heartfelt farewell—for now. She’ll highlight upcoming events, calls to action, and ways you can stay involved in the Kubeflow community. |