The Markovian Thinker

In this episode, we discuss The Markovian Thinker by Milad Aghajohari, Kamran Chitsaz, Amirhossein Kazemnejad, Sarath Chandar, Alessandro Sordoni, Aaron Courville, and Siva Reddy. The paper proposes Markovian Thinking, a reinforcement learning paradigm that limits the reasoning context to a constant-size state, so that compute scales linearly with reasoning length and memory stays constant, rather than compute growing quadratically. The authors implement this approach in Delethink, an environment that segments reasoning into fixed-size chunks and relies on learned textual states to continue reasoning seamlessly after each context reset. Experiments show that Delethink-trained models sustain longer reasoning chains more efficiently and scale better than standard methods, significantly reducing computational costs.
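
The scaling claim can be illustrated with a toy cost model. This is a rough sketch under stated assumptions, not the paper's implementation: attention cost is approximated as the number of tokens each newly generated token attends to, the prompt is ignored, and the names `full_context_cost`, `chunked_cost`, `chunk_size`, and `carryover` are hypothetical and introduced only for illustration.

```python
def full_context_cost(n_tokens: int) -> int:
    """Attended positions when each new token sees the whole history.

    The t-th generated token attends to t positions (itself included),
    so the total is 1 + 2 + ... + n, which grows quadratically in n.
    """
    return n_tokens * (n_tokens + 1) // 2


def chunked_cost(n_tokens: int, chunk_size: int, carryover: int) -> int:
    """Attended positions when the context is reset after every chunk.

    Assumption (hypothetical model): each chunk starts from `carryover`
    tokens of textual state and is filled up to `chunk_size` tokens,
    so the context never exceeds `chunk_size`.
    """
    if not 0 <= carryover < chunk_size:
        raise ValueError("carryover must satisfy 0 <= carryover < chunk_size")
    new_per_chunk = chunk_size - carryover
    cost = 0
    remaining = n_tokens
    while remaining > 0:
        step = min(new_per_chunk, remaining)
        # New tokens occupy positions carryover+1 .. carryover+step.
        cost += sum(range(carryover + 1, carryover + step + 1))
        remaining -= step
    return cost


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        print(n, full_context_cost(n), chunked_cost(n, chunk_size=500, carryover=100))
```

In this model, doubling the number of generated tokens roughly quadruples the full-context cost but only doubles the chunked cost, while the largest context held in memory stays at `chunk_size`.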