Do wide and deep networks learn the same things? Uncovering how neural network representations vary with width and depth

T Nguyen, M Raghu, S Kornblith - arXiv preprint arXiv:2010.15327, 2020 - arxiv.org
A key factor in the success of deep neural networks is the ability to scale models to improve
performance by varying the architecture depth and width. This simple property of neural …

Deep learning through the lens of example difficulty

R Baldock, H Maennel… - Advances in Neural …, 2021 - proceedings.neurips.cc
Existing work on understanding deep learning often employs measures that compress all
data-dependent information into a few numbers. In this work, we adopt a perspective based …

Membership inference attacks by exploiting loss trajectory

Y Liu, Z Zhao, M Backes, Y Zhang - Proceedings of the 2022 ACM …, 2022 - dl.acm.org
Machine learning models are vulnerable to membership inference attacks in which an
adversary aims to predict whether or not a particular sample was contained in the target …

Active learning on a budget: Opposite strategies suit high and low budgets

G Hacohen, A Dekel, D Weinshall - arXiv preprint arXiv:2202.02794, 2022 - arxiv.org
Investigating active learning, we focus on the relation between the number of labeled
examples (budget size), and suitable querying strategies. Our theoretical analysis shows a …

[HTML] A wholistic view of continual learning with deep neural networks: Forgotten lessons and the bridge to active and open world learning

M Mundt, Y Hong, I Pliushch, V Ramesh - Neural Networks, 2023 - Elsevier
Current deep learning methods are regarded as favorable if they empirically perform well on
dedicated test sets. This mentality is seamlessly reflected in the resurfacing area of continual …

Mechanistic Interpretability for AI Safety--A Review

L Bereska, E Gavves - arXiv preprint arXiv:2404.14082, 2024 - arxiv.org
Understanding AI systems' inner workings is critical for ensuring value alignment and safety.
This review explores mechanistic interpretability: reverse-engineering the computational …

The lean data scientist: recent advances toward overcoming the data bottleneck

C Shani, J Zarecki, D Shahaf - Communications of the ACM, 2023 - dl.acm.org
Obtaining data has become the key bottleneck in many machine-learning (ML) applications …

Characterizing datapoints via second-split forgetting

P Maini, S Garg, Z Lipton… - Advances in Neural …, 2022 - proceedings.neurips.cc
Researchers investigating example hardness have increasingly focused on the dynamics by
which neural networks learn and forget examples throughout training. Popular metrics …

Fusing finetuned models for better pretraining

L Choshen, E Venezian, N Slonim, Y Katz - arXiv preprint arXiv …, 2022 - arxiv.org
Pretrained models are the standard starting point for training. This approach consistently
outperforms the use of a random initialization. However, pretraining is a costly endeavour …

Do input gradients highlight discriminative features?

H Shah, P Jain, P Netrapalli - Advances in Neural …, 2021 - proceedings.neurips.cc
Post-hoc gradient-based interpretability methods [Simonyan et al., 2013, Smilkov et al.,
2017] that provide instance-specific explanations of model predictions are often based on …