When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models

TY Chang, J Thomason, R Jia - arXiv preprint arXiv:2406.13131, 2024 - arxiv.org
This paper studies in-context learning (ICL) by decomposing the output of large language
models into the individual contributions of attention heads and MLPs (components). We …
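
A minimal sketch of the decomposition idea this snippet describes, not the authors' code: assuming a GPT-2-style transformer whose final residual stream is the sum of the embedding, attention-head, and MLP outputs, each component's contribution to the logits can be read off linearly by projecting it through the unembedding matrix (the final LayerNorm is ignored here for simplicity; all names and shapes below are illustrative).

    import numpy as np

    d_model, vocab = 16, 50
    rng = np.random.default_rng(0)

    # Hypothetical cached activations at the final token position.
    embed = rng.normal(size=d_model)               # token + positional embedding
    head_outputs = rng.normal(size=(12, d_model))  # one output vector per attention head
    mlp_outputs = rng.normal(size=(2, d_model))    # one output vector per MLP layer
    W_U = rng.normal(size=(d_model, vocab))        # unembedding matrix

    # The residual stream is a sum, so the logits decompose linearly:
    # logits = (embed + sum(heads) + sum(mlps)) @ W_U
    full_logits = (embed + head_outputs.sum(0) + mlp_outputs.sum(0)) @ W_U

    # Per-component contribution to each logit.
    head_contrib = head_outputs @ W_U              # shape (n_heads, vocab)
    mlp_contrib = mlp_outputs @ W_U                # shape (n_mlps, vocab)

    # Sanity check: component contributions sum back to the full logits.
    assert np.allclose(embed @ W_U + head_contrib.sum(0) + mlp_contrib.sum(0),
                       full_logits)

Under this linearity assumption, asking whether an individual component "outperforms the full model" amounts to comparing predictions made from one row of head_contrib or mlp_contrib against those from full_logits.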

Learned feature representations are biased by complexity, learning order, position, and more

AK Lampinen, SCY Chan, K Hermann - arXiv preprint arXiv:2405.05847, 2024 - arxiv.org
Representation learning, and interpreting learned representations, are key areas of focus in
machine learning and neuroscience. Both fields generally use representations as a means …