vLLM: An Efficient Inference and Serving Engine for LLMs

Pisa.dev is back with a new event and a new location!
Join us on February 9th at 18.30 at Detaills (Via San Martino 3, Pisa) for our meetup.
The speaker will be Nicolò Lucchesi, vLLM Maintainer & Senior ML Engineer @ Red Hat AI
Talk: vLLM: an efficient inference and serving engine for LLMs
Abstract: This talk is a quick tour of vLLM, an open-source LLM inference and serving engine focused on speed and practical production use. We'll look at how core ideas like PagedAttention and continuous batching solve common inference pain points, and then touch on newer work around distributed inference, parallelism, and token efficiency at scale.
Expect architecture, practical takeaways, and a peek at what’s coming next.
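For a rough intuition ahead of the talk: PagedAttention manages the KV cache the way an OS manages virtual memory, splitting it into fixed-size blocks and mapping each sequence's logical blocks to physical ones on demand, instead of pre-reserving memory for the maximum possible sequence length. The sketch below is a toy illustration of that bookkeeping with hypothetical names (`BlockAllocator`, `Sequence`), not vLLM's actual implementation:

```python
# Toy sketch of the PagedAttention bookkeeping (illustrative only, not
# vLLM's real code): the KV cache is split into fixed-size blocks, and a
# per-sequence block table maps logical block indices to physical block
# IDs, so memory is allocated one block at a time as tokens are generated.

BLOCK_SIZE = 4  # tokens per KV-cache block (a small value for the example)

class BlockAllocator:
    """Hands out physical block IDs from a fixed pool."""
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))

    def allocate(self):
        return self.free.pop()  # raises IndexError when the pool is exhausted

class Sequence:
    """Tracks one request's tokens and its logical-to-physical block table."""
    def __init__(self, allocator):
        self.allocator = allocator
        self.block_table = []  # block_table[logical_idx] -> physical block ID
        self.num_tokens = 0

    def append_token(self):
        # Grab a new physical block only when the current one is full,
        # so no memory is reserved for tokens that may never be generated.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.allocate())
        self.num_tokens += 1

allocator = BlockAllocator(num_blocks=8)
seq = Sequence(allocator)
for _ in range(10):  # generate 10 tokens
    seq.append_token()

# 10 tokens at 4 tokens/block need ceil(10/4) = 3 physical blocks;
# the other 5 blocks in the pool stay free for concurrent sequences.
print(len(seq.block_table))  # -> 3
```

Because blocks are claimed lazily, many sequences can share the same pool, which is also what makes continuous batching practical: finished requests return their blocks and new requests join the running batch immediately.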
As always, we’ll wrap up the evening with pizza and beer at a local pizzeria.
Get your free ticket: https://www.eventbrite.it/e/biglietti-pisadev-february-2026-meetup-1982302729827




