Arxiv paper - OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization Podcast Por  arte de portada

Arxiv paper - OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization

Arxiv paper - OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization

Escúchala gratis

Ver detalles del espectáculo

Acerca de esta escucha

In this episode, we discuss OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization by Yiyou Sun, Shawn Hu, Georgia Zhou, Ken Zheng, Hannaneh Hajishirzi, Nouha Dziri, Dawn Song. The paper introduces OMEGA, a new benchmark to evaluate large language models' out-of-distribution generalization on math problems along three creativity-inspired axes: exploratory, compositional, and transformative reasoning. Evaluations reveal that state-of-the-art LLMs struggle increasingly with problem complexity, especially in compositional and transformative reasoning. Fine-tuning improves exploratory skills but not the other two, highlighting challenges in achieving genuine mathematical creativity beyond routine problem-solving.
Todavía no hay opiniones