Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Scaling and renormalization in high-dimensional regression

A Atanasov, JA Zavatone-Veth, C Pehlevan - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents a succinct derivation of the training and generalization performance of a
variety of high-dimensional ridge regression models using the basic tools of random matrix …

A dynamical model of neural scaling laws

B Bordelon, A Atanasov, C Pehlevan - arXiv preprint arXiv:2402.01092, 2024 - arxiv.org
On a variety of tasks, the performance of neural networks predictably improves with training
time, dataset size and model size across many orders of magnitude. This phenomenon is …

On the Parameterization of Second-Order Optimization Effective Towards the Infinite Width

S Ishikawa, R Karakida - arXiv preprint arXiv:2312.12226, 2023 - arxiv.org
Second-order optimization has been developed to accelerate the training of deep neural
networks and it is being applied to increasingly larger-scale models. In this study, towards …

Steering Deep Feature Learning with Backward Aligned Feature Updates

L Chizat, P Netrapalli - arXiv preprint arXiv:2311.18718, 2023 - arxiv.org
Deep learning succeeds by doing hierarchical feature learning, yet tuning hyper-parameters
(HP) such as initialization scales, learning rates, etc., only gives indirect control …

Self-consistent dynamical field theory of kernel evolution in wide neural networks

B Bordelon, C Pehlevan - Journal of Statistical Mechanics: Theory …, 2023 - iopscience.iop.org
We analyze feature learning in infinite-width neural networks trained with gradient flow
through a self-consistent dynamical field theory. We construct a collection of deterministic …

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning

L Noci, A Meterez, T Hofmann, A Orvieto - arXiv preprint arXiv:2402.17457, 2024 - arxiv.org
Recently, there has been growing evidence that if the width and depth of a neural network
are scaled toward the so-called rich feature learning limit ($\mu$P and its depth extension) …

Visualising Feature Learning in Deep Neural Networks by Diagonalizing the Forward Feature Map

Y Nam, C Mingard, SH Lee, S Hayou… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep neural networks (DNNs) exhibit a remarkable ability to automatically learn data
representations, finding appropriate features without human input. Here we present a …

Flexible infinite-width graph convolutional networks and the importance of representation learning

B Anson, E Milsom, L Aitchison - arXiv preprint arXiv:2402.06525, 2024 - arxiv.org
A common theoretical approach to understanding neural networks is to take an infinite-width
limit, at which point the outputs become Gaussian process (GP) distributed. This is known as …

PhiNets: Brain-inspired Non-contrastive Learning Based on Temporal Prediction Hypothesis

S Ishikawa, M Yamada, H Bao, Y Takezawa - arXiv preprint arXiv …, 2024 - arxiv.org
SimSiam is a prominent self-supervised learning method that achieves impressive results in
various vision tasks under static environments. However, it has two critical issues: high …