AttnLRP: Explainable AI for Transformers

This 2024 paper introduces AttnLRP, a novel method for explaining the internal reasoning of transformer models, including Large Language Models (LLMs) and Vision Transformers (ViTs). It extends Layer-wise Relevance Propagation (LRP) with new rules for the non-linear operations inside attention layers, namely softmax and matrix multiplication, improving faithfulness and computational efficiency over existing methods. The paper highlights AttnLRP's ability to attribute relevance to latent representations, enabling the identification and manipulation of "knowledge neurons" within these complex models. Experimental results demonstrate AttnLRP's superior performance across various benchmarks and model architectures.
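For background on the LRP framework that AttnLRP extends, here is a minimal sketch of the classic epsilon-rule for a single linear layer (this is standard LRP, not the paper's new softmax or matrix-multiplication rules; the function name and toy values are illustrative):

```python
import numpy as np

def lrp_epsilon_linear(a, W, b, R_out, eps=1e-6):
    """Standard LRP epsilon-rule for a linear layer y = a @ W + b.

    Redistributes the output relevance R_out onto the inputs in
    proportion to each contribution z_ij = a_i * W_ij. AttnLRP's
    attention-specific rules build on this principle (see the paper).
    """
    z = a @ W + b                                 # forward pre-activations
    s = R_out / (z + np.where(z >= 0, eps, -eps)) # stabilized ratio
    return a * (s @ W.T)                          # relevance on the inputs

# Toy example: with b = 0, relevance is conserved up to the stabilizer.
a = np.array([1.0, 2.0])
W = np.array([[0.5, -1.0], [1.5, 1.0]])
b = np.zeros(2)
R_out = np.array([1.0, 1.0])
R_in = lrp_epsilon_linear(a, W, b, R_out)
```

The conservation property (input relevances sum to the output relevance) is what makes LRP attributions interpretable as a decomposition of the model's output.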


Source: https://arxiv.org/pdf/2402.05602
