
Upcoming llm-d Events

Stay connected with the llm-d community at meetups, conferences, and workshops. All meetings are open to the public unless noted otherwise.

March 2026

llm-d Distributed Inference Meetup NYC

📍 IBM Innovation Studio, 1 Madison Ave, NYC

March 11, 2026 · Free

Sessions

Intro to llm-d for Open Source Distributed Inference & Project Update
Wed, Mar 11, 2026 · 5:15 PM ET · 📍 IBM Innovation Studio
Distributed LLM Serving on AMD with llm-d
Wed, Mar 11, 2026 · 5:35 PM ET · 📍 IBM Innovation Studio
The Path to Intelligent Routing: Lessons Learned Scaling Wide-EP and MoE Models
Wed, Mar 11, 2026 · 5:55 PM ET · 📍 IBM Innovation Studio
KV-Cache Wins You Can See: Prefix-Cache Scheduling, Offloading, and Scaling
Wed, Mar 11, 2026 · 6:15 PM ET · 📍 IBM Innovation Studio

KubeCon + CloudNativeCon Europe 2026

📍 Amsterdam, The Netherlands

March 23–26, 2026 · Paid

Sessions

Panel: Routing Intelligence Vs Traffic Control: Architectural Tradeoffs for AI Inference in Gateway API
Mon, Mar 23, 2026 · 12:45 – 13:20 CET · 📍 Hall 7 | Room B
Cloud Native Theater | Istio Day: Running State of the Art Inference with Istio and llm-d
Tue, Mar 24, 2026 · 16:00 – 16:35 CET · 📍 Halls 1-5
Route, Serve, Adapt, Repeat: Adaptive Routing for AI Inference Workloads in Kubernetes
Wed, Mar 25, 2026 · 11:45 – 12:15 CET · 📍 Auditorium
Tutorial: KV-Cache Wins You Can Feel: Building AI-Aware LLM Routing on Kubernetes
Thu, Mar 26, 2026 · 11:00 – 12:15 CET · 📍 Elicium 1
Evolving KServe: The Unified Model Inference Platform for Both Predictive and Generative AI
Thu, Mar 26, 2026 · 11:00 – 11:30 CET · 📍 E103-105

April 2026

PyTorch Conference Europe 2026

📍 Paris, France

April 7–8, 2026 · Paid

Sessions

Why WideEP Inference Needs Data-Parallel-Aware Scheduling
Tue, Apr 7, 2026 · 13:35 – 14:00 CEST · 📍 Central Room
The Token Slice: Implementing Preemptive Scheduling Via Chunked Decoding
Tue, Apr 7, 2026 · 14:05 – 14:30 CEST · 📍 Central Room
Lightning Talk: Beyond Generic Spans: Distributed Tracing for Actionable LLM Observability
Tue, Apr 7, 2026 · 15:45 – 15:55 CEST · 📍 Master Stage
Birds of A Feather: Disaggregated Tokenization: Building Toward Tokens-In-Tokens-Out LLM Inference
Wed, Apr 8, 2026 · 10:10 – 10:35 CEST · 📍 TBA
Lightning Talk: KV-Cache Centric Inference: Building a State-Aware Serving Platform With llm-d and vLLM
Wed, Apr 8, 2026 · 11:10 – 11:20 CEST · 📍 Founders Cafe
Lightning Talk: Not All Tokens Are Equal: Semantic KV-Cache for Agentic LLM Serving
Wed, Apr 8, 2026 · 11:25 – 11:35 CEST · 📍 Founders Cafe
Lightning Talk: Inside vLLM's KV Offloading Connector: Async Memory Transfers for Higher Inference Throughput
Wed, Apr 8, 2026 · 14:20 – 14:30 CEST · 📍 Central Room