GenAI Applications: Development to Production

Cloud Native Silicon Valley

Jul 11, 12:30 – 3:00 AM (UTC)

In-person event

About this event

Looking for a fun tech evening after the holiday weekend? Join Ray and our South Bay friends, who are hosting this tech meetup at Nutanix.

MUST REGISTER HERE !!!

Details:

Join us for an opportunity to network with AI application developers in the Bay Area! Hear insights from speakers representing Nutanix and PingCAP as they discuss the latest advancements in building and deploying scalable GenAI solutions.

Agenda

  • 5:30 - 6:00 pm: Check-in/networking (food & drinks)

  • 6:00 - 6:30 pm: Talk #1 - Lessons learned from managing GPU deployments on Kubernetes

  • 6:30 - 7:00 pm: Talk #2 - Ship a Lightning-Fast FAQ & Stop Keyword Fails

  • 7:00 - 7:30 pm: Talk #3 - Setting the BAR: Balancing Budget, Authenticity & Reasoning in GenAI

  • 7:30 - 8:00 pm: Wrap-up/networking


Talk #1 - Lessons learned from managing GPU deployments on Kubernetes

Speakers: Sonali Mishra & Shalin Patel (Nutanix)
Abstract: Handling workloads that require GPUs on Kubernetes has become easier with tools like NVIDIA’s GPU Operator, but deploying them in production across real-world environments brings hidden challenges. In this session, we will share lessons learned from managing GPUs across different infrastructures, including on-prem, public clouds, air-gapped environments, and different operating systems. We will cover compatibility issues with drivers and runtimes, discovering GPU-attached nodes, and scheduling GPU workloads on Kubernetes clusters. We will also discuss our experience working with vGPUs and the challenges of enabling multi-tenancy, dynamic resource allocation, and monitoring. Finally, we will talk about how we addressed some of these challenges using the NVIDIA GPU Operator and our own Kubernetes operator for vGPU token and license management. This talk is ideal for platform engineers and architects who are bringing AI/ML to Kubernetes and want to scale GPU use efficiently, securely, and with better observability.

Talk #2 - Ship a Lightning-Fast FAQ & Stop Keyword Fails

Speaker: Chris Dabatos (PingCAP)
Abstract: In this session, learn how anyone can build and ship a lightning-fast FAQ using TiDB's built-in vector search, AWS Bedrock, and a few plain-English Python scripts. There's no need for developers to deal with microservices or cluster babysitting, and since TiDB is MySQL compatible, your existing ORM will just work.

Talk #3 - Setting the BAR: Balancing Budget, Authenticity & Reasoning in GenAI

Speaker: Jinan Zhou (Nutanix)
Abstract: There is no perfect GenAI cocktail: every system has to mix Budget, Authenticity, and Reasoning. Raise the BAR on two, and you’ll have to water down the third. In this session, we’ll introduce the BAR Triangle, a framework for understanding why it’s impossible to fully optimize every dimension of a GenAI system at once. Through practical examples and case studies, you’ll learn how to map your own system onto the triangle, identify the hidden costs of maximizing certain aspects, and develop strategies for choosing the right BAR for your product. Whether you’re building enterprise AI, consumer chatbots, or mission-critical GenAI, this talk will help you make smarter, more transparent tradeoffs and “set the BAR” that matters most for your goals.

PER BUILDING REQUIREMENTS - MUST REGISTER HERE !!!

When

Friday, July 11, 2025
12:30 AM – 3:00 AM (UTC)

Organizers

  • Lisa-Marie Namphy

    Director, Developer Relations

  • John Starmer

    Kumulus Technologies

    Lead Organizer
