• 16: Infini-Attention: Google's Solution for Infinite Memory in LLMs

  • May 22 2024
  • Length: 23 mins
  • Podcast

  • Summary

  • In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques welcome Leticia Fernandes, a Deeper Insights Senior Data Scientist and Generative AI Ambassador. Together, they explore the groundbreaking "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" paper from Google. The paper tackles the challenge of giving large language models effectively unbounded context without unbounded memory, introducing the Infini-attention method. The trio discusses how the approach works, including how it combines linear attention with a compressive memory that stores key-value associations, letting models handle very long contexts (a rough sketch of the idea follows below).

    We also extend a special thank you to the research team at Google for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2404.07143.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at thepaperclub@deeperinsights.com.
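
    As a rough illustration of the mechanism discussed above, here is a minimal sketch (our own simplification in Python/NumPy, not code from the paper or the episode) of how a linear-attention-style compressive memory can accumulate key-value associations segment by segment and then be queried at a cost that stays fixed no matter how much context has already been seen. The ELU+1 feature map and the memory/normalizer update follow the paper's description; the dimensions, segment size, and random data are illustrative assumptions.

        import numpy as np

        def elu_plus_one(x):
            # ELU(x) + 1: keeps transformed keys/queries positive, as described in the paper
            return np.where(x > 0, x + 1.0, np.exp(x))

        d = 64                     # head dimension (illustrative choice)
        memory = np.zeros((d, d))  # compressive memory: a single d x d associative matrix
        z = np.zeros(d)            # running normalization term

        rng = np.random.default_rng(0)
        for segment in range(4):               # process a long stream one segment at a time
            K = rng.standard_normal((128, d))  # this segment's keys (128 tokens, stand-in data)
            V = rng.standard_normal((128, d))  # this segment's values
            Q = rng.standard_normal((128, d))  # this segment's queries

            sK, sQ = elu_plus_one(K), elu_plus_one(Q)

            # read from the memory written by all previous segments;
            # cost per token is O(d^2), independent of total history length
            retrieved = (sQ @ memory) / (sQ @ z + 1e-6)[:, None]

            # write this segment's key-value associations into the memory
            memory = memory + sK.T @ V
            z = z + sK.sum(axis=0)

            # in the full Infini-attention layer, `retrieved` is gated together with
            # ordinary local dot-product attention over the current segment (omitted here)

    Because the memory is a single d-by-d matrix plus a length-d normalizer, its size does not grow with the number of segments processed, which is what lets the model keep "infinite" context within bounded memory.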

