Arxiv paper - How much do language models memorize? Podcast Por  arte de portada

Arxiv paper - How much do language models memorize?

Arxiv paper - How much do language models memorize?

Escúchala gratis

Ver detalles del espectáculo

Acerca de esta escucha

In this episode, we discuss How much do language models memorize? by John X. Morris, Chawin Sitawarin, Chuan Guo, Narine Kokhlikyan, G. Edward Suh, Alexander M. Rush, Kamalika Chaudhuri, Saeed Mahloujifar. The paper introduces a method to quantify how much a language model memorizes versus generalizes from data, defining model capacity as total memorization excluding generalization. Through extensive experiments on GPT-family models of varying sizes, the authors find that models memorize data until their capacity is full, after which generalization (or "grokking") increases and unintended memorization decreases. They establish scaling laws linking model capacity, data size, and membership inference, estimating GPT models have about 3.6 bits-per-parameter capacity.
Todavía no hay opiniones