Check out the latest model drops and powerful integrations.
Interested in learning more about the technology that underpins Daydream's realtime video AI? Check out Rafał Leszko's talk at the Demuxed conference on how diffusion models work.
Talk synopsis:
Generative AI is opening new possibilities for creating and transforming video in real time.
In this talk, I’ll explore how recent models such as StreamDiffusion and LongLive push diffusion techniques into practical use for low-latency video generation and transformation.
I’ll give a deep technical walkthrough of how these systems can be adapted for streaming use cases, unpacking the full pipeline - from decoding, through the diffusion process, to encoding - and highlighting optimisation strategies, such as KV caching, that make interactive generation possible.
I’ll also discuss the tradeoffs between ultra-low latency video transformation and generating longer, more coherent streams. To make it concrete, I’ll present demos of StreamDiffusion (served with the open-source cloud service Daydream) and LongLive (explored with the open-source research tool Scope), showcasing practical examples of both video-to-video transformation and streaming text-to-video generation.
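The per-frame loop the synopsis describes (decode, run a diffusion step, encode, while reusing cached context across frames) can be sketched in miniature. Everything below is an illustrative toy, not the API of StreamDiffusion, LongLive, Daydream, or Scope: the function names, the stand-in math, and the `kv_cache` dict (which mimics how attention key/value tensors persist across frames so earlier context is not recomputed) are all assumptions made for the sketch.

```python
# Toy sketch of a streaming diffusion pipeline:
# decode -> diffusion step (with a persistent cache) -> encode.
# All names and logic here are illustrative placeholders, not a real API.

def decode(packet: bytes) -> list[float]:
    """Stand-in for video decoding: compressed packet -> normalized frame."""
    return [b / 255.0 for b in packet]

def diffusion_step(frame: list[float], kv_cache: dict) -> list[float]:
    """Stand-in for one denoising pass. The kv_cache mimics how attention
    key/value tensors are reused between frames instead of being recomputed,
    which is one of the optimizations that makes low latency feasible."""
    ctx = kv_cache.get("context", 0.0)          # reuse cached context
    out = [min(1.0, x + 0.1 * ctx) for x in frame]
    kv_cache["context"] = sum(out) / len(out)   # update cache for next frame
    return out

def encode(frame: list[float]) -> bytes:
    """Stand-in for video encoding: quantize the frame back to bytes."""
    return bytes(round(x * 255) for x in frame)

def stream(packets):
    kv_cache = {}  # persists across the whole stream, like a KV cache
    for packet in packets:
        yield encode(diffusion_step(decode(packet), kv_cache))
```

The point of the sketch is the shape of the loop: per-frame latency is decode + one (or few) diffusion passes + encode, and the cache is what lets each frame reuse work from its predecessors rather than re-running the full context.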