• 16: Infini-Attention: Google's Solution for Infinite Memory in LLMs

  • May 22 2024
  • Length: 23 mins
  • Podcast

  • Summary

  • In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques welcome Leticia Fernandes, a Deeper Insights Senior Data Scientist and Generative AI Ambassador. Together, they explore the groundbreaking "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" paper from Google. The paper tackles the challenge of giving large language models effectively unbounded context without unbounded memory, introducing the Infini-attention method. The trio discusses how the approach works, including how it combines linear attention with a compressive memory that stores key-value associations, letting models handle very long contexts (a rough sketch of the idea follows below).

    We also extend a special thank you to the research team at Google for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2404.07143.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at thepaperclub@deeperinsights.com.
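
    As a rough illustration of the mechanism discussed above, here is a minimal sketch (our own simplification in Python/NumPy, not code from the paper or the episode) of how a linear-attention-style compressive memory can accumulate key-value associations segment by segment and then be queried at a cost that stays fixed no matter how much context has already been seen. The ELU+1 feature map and the memory/normalizer update follow the paper's description; the dimensions, segment size, and random data are illustrative assumptions.

        import numpy as np

        def elu_plus_one(x):
            # ELU(x) + 1: keeps transformed keys/queries positive, as described in the paper
            return np.where(x > 0, x + 1.0, np.exp(x))

        d = 64                     # head dimension (illustrative choice)
        memory = np.zeros((d, d))  # compressive memory: a single d x d associative matrix
        z = np.zeros(d)            # running normalization term

        rng = np.random.default_rng(0)
        for segment in range(4):               # process a long stream one segment at a time
            K = rng.standard_normal((128, d))  # this segment's keys (128 tokens, stand-in data)
            V = rng.standard_normal((128, d))  # this segment's values
            Q = rng.standard_normal((128, d))  # this segment's queries

            sK, sQ = elu_plus_one(K), elu_plus_one(Q)

            # read from the memory written by all previous segments;
            # cost per token is O(d^2), independent of total history length
            retrieved = (sQ @ memory) / (sQ @ z + 1e-6)[:, None]

            # write this segment's key-value associations into the memory
            memory = memory + sK.T @ V
            z = z + sK.sum(axis=0)

            # in the full Infini-attention layer, `retrieved` is gated together with
            # ordinary local dot-product attention over the current segment (omitted here)

    Because the memory is a single d-by-d matrix plus a length-d normalizer, its size does not grow with the number of segments processed, which is what lets the model keep "infinite" context within bounded memory.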

