Continual Learning and Memory (2): Memory Architecture in Language Models
In our first post, we recapped the formulations of two continual-learning papers. Both perform test-time learning to compress raw input into model parameters, effectively treating those parameters as a dynamic working memory. In this post, we step back and examine the evolution of language models through the lens of memory.
