A primer on the inner workings of transformer-based language models
The rapid progress of research aimed at interpreting the inner workings of advanced
language models has highlighted a need for contextualizing the insights gained from years …
Neuron-level knowledge attribution in large language models
Z Yu, S Ananiadou - Proceedings of the 2024 Conference on …, 2024 - aclanthology.org
Identifying important neurons for final predictions is essential for understanding the
mechanisms of large language models. Due to computational constraints, current attribution …
On logical extrapolation for mazes with recurrent and implicit networks
B Knutson, AC Rabeendran, M Ivanitskiy… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent work has suggested that certain neural network architectures, particularly recurrent
neural networks (RNNs) and implicit neural networks (INNs), are capable of logical …
Encourage or inhibit monosemanticity? revisit monosemanticity from a feature decorrelation perspective
To better interpret the intrinsic mechanism of large language models (LLMs), recent studies
focus on monosemanticity on its basic units. A monosemantic neuron is dedicated to a …
Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
This paper introduces an efficient and robust method for discovering interpretable circuits in
large language models using discrete sparse autoencoders. Our approach addresses key …
Sparse Prototype Network for Explainable Pedestrian Behavior Prediction
Y Feng, A Carballo, K Takeda - arXiv preprint arXiv:2410.12195, 2024 - arxiv.org
Predicting pedestrian behavior is challenging yet crucial for applications such as
autonomous driving and smart city. Recent deep learning models have achieved …
Local Sparse Representations: Connections With the Delaunay Triangulation and Dictionary Learning in Wasserstein Space
M Mueller - 2024 - search.proquest.com
We pursue local sparse representations of data by considering a common data model where
representations are formed as a combination of atoms that we call a dictionary. Our focus is …