Arxiv paper - Long-Context State-Space Video World Models (Podcast)

About this episode

In this episode, we discuss Long-Context State-Space Video World Models by Ryan Po, Yotam Nitzan, Richard Zhang, Berlin Chen, Tri Dao, Eli Shechtman, Gordon Wetzstein, Xun Huang. The paper introduces a novel video diffusion model architecture that uses state-space models (SSMs) to extend temporal memory efficiently for causal sequence modeling. It employs a block-wise SSM scanning scheme combined with dense local attention to balance long-term memory with spatial coherence. Experiments on Memory Maze and Minecraft datasets show the method outperforms baselines in long-range memory retention while maintaining fast inference suitable for real-time use.
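To make the block-wise scanning idea more concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation. It assumes a toy scalar-decay linear recurrence in place of a real SSM layer, a flat list of spatial tokens per frame, and contiguous token blocks; the names `linear_scan`, `blockwise_scan`, `decay`, and `block_size` are hypothetical. The dense local attention that the episode description pairs with the scan is omitted.

```python
import numpy as np


def linear_scan(u, decay):
    """Causal recurrence h_t = decay * h_{t-1} + (1 - decay) * u_t over axis 0.

    A toy stand-in for an SSM layer (assumption, not the paper's model).
    u has shape (L, D).
    """
    h = np.zeros(u.shape[1])
    out = np.empty_like(u, dtype=float)
    for t in range(u.shape[0]):
        h = decay * h + (1.0 - decay) * u[t]
        out[t] = h
    return out


def blockwise_scan(x, decay, block_size):
    """Scan each spatial block across time independently.

    x has shape (T, N, D): T frames, N spatial tokens per frame, D channels.
    Tokens are split into N // block_size contiguous blocks (assumed layout).
    Each block's tokens are ordered frame by frame and scanned on their own,
    so a block's state never mixes with that of other blocks.
    """
    T, N, D = x.shape
    if N % block_size != 0:
        raise ValueError("N must be divisible by block_size")
    n_blocks = N // block_size

    # (T, N, D) -> (n_blocks, T * block_size, D): one sequence per block.
    seqs = (
        x.reshape(T, n_blocks, block_size, D)
        .transpose(1, 0, 2, 3)
        .reshape(n_blocks, T * block_size, D)
    )
    outs = np.stack([linear_scan(s, decay) for s in seqs])

    # Undo the regrouping to return to the (T, N, D) layout.
    return (
        outs.reshape(n_blocks, T, block_size, D)
        .transpose(1, 0, 2, 3)
        .reshape(T, N, D)
    )
```

In this sketch, a smaller `block_size` puts fewer tokens between consecutive frames in each scanned sequence, which is one way to picture the trade between temporal memory and spatial mixing that the description mentions. Mixing information across blocks within a frame would be left to the local attention component.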