information-theory

1 article

This ICLR 2026 paper frames large language model training as lossy compression, showing that LLMs learn compressions of their training data for next-token prediction that approach the theoretical optimum given by the Information Bottleneck. The work further shows that the quality and structure of these compressions predict downstream benchmark performance across model families, yielding an information-theoretic framework for understanding LLM learning and representational spaces.
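For background (not drawn from the paper itself): the Information Bottleneck bounds the summary refers to come from the standard IB objective of Tishby, Pereira, and Bialek, which trades off compressing an input $X$ into a representation $Z$ against keeping $Z$ predictive of a target $Y$. A minimal statement:

```latex
% Information Bottleneck objective (Tishby, Pereira & Bialek).
% Choose the encoder p(z|x) so that the representation Z is as
% compressed as possible (small I(X;Z)) while remaining predictive
% of the target Y (large I(Z;Y)); beta >= 0 sets the trade-off.
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  = I(X; Z) \;-\; \beta \, I(Z; Y)
```

Sweeping $\beta$ traces out the optimal compression–prediction frontier; the paper's claim, per the summary, is that trained LLMs sit near this frontier.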

ICLR 2026 · Henry Conklin, Tom Hosking, Tan Yi-Chern, Jonathan D. Cohen, Sarah-Jane Leslie, Thomas L. Griffiths, Max Bartolo, Seraphina Goldfarb-Tarrant · OpenReview
openreview.net · pera · 1 day ago