Editing factual knowledge in language models

N De Cao, W Aziz, I Titov - arXiv preprint arXiv:2104.08164, 2021 - arxiv.org
The factual knowledge acquired during pre-training and stored in the parameters of
Language Models (LMs) can be useful in downstream tasks (e.g., question answering or …

Balancing discriminability and transferability for source-free domain adaptation

JN Kundu, AR Kulkarni, S Bhambri… - International …, 2022 - proceedings.mlr.press
Conventional domain adaptation (DA) techniques aim to improve domain transferability by
learning domain-invariant representations, while concurrently preserving the task …

Deduplicating training data makes language models better

K Lee, D Ippolito, A Nystrom, C Zhang, D Eck… - arXiv preprint arXiv …, 2021 - arxiv.org
We find that existing language modeling datasets contain many near-duplicate examples
and long repetitive substrings. As a result, over 1% of the unprompted output of language …

From lazy to rich to exclusive task representations in neural networks and neural codes

M Farrell, S Recanatesi, E Shea-Brown - Current opinion in neurobiology, 2023 - Elsevier
Neural circuits—both in the brain and in “artificial” neural network models—learn to solve a
remarkable variety of tasks, and there is a great current opportunity to use neural networks …

Fast machine unlearning without retraining through selective synaptic dampening

J Foster, S Schoepf, A Brintrup - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Machine unlearning, the ability for a machine learning model to forget, is becoming
increasingly important to comply with data privacy regulations, as well as to remove harmful …

Subsidiary prototype alignment for universal domain adaptation

JN Kundu, S Bhambri, AR Kulkarni… - Advances in …, 2022 - proceedings.neurips.cc
Universal Domain Adaptation (UniDA) deals with the problem of knowledge transfer
between two datasets with domain-shift as well as category-shift. The goal is to categorize …

Feddefender: Client-side attack-tolerant federated learning

S Park, S Han, F Wu, S Kim, B Zhu, X Xie… - Proceedings of the 29th …, 2023 - dl.acm.org
Federated learning enables learning from decentralized data sources without compromising
privacy, which makes it a crucial technique. However, it is vulnerable to model poisoning …

Sparse double descent: Where network pruning aggravates overfitting

Z He, Z Xie, Q Zhu, Z Qin - International Conference on …, 2022 - proceedings.mlr.press
People usually believe that network pruning not only reduces the computational cost of deep
networks, but also prevents overfitting by decreasing model capacity. However, our work …

On memorization in probabilistic deep generative models

G van den Burg, C Williams - Advances in Neural …, 2021 - proceedings.neurips.cc
Recent advances in deep generative models have led to impressive results in a variety of
application domains. Motivated by the possibility that deep learning models might memorize …

Task-aware information routing from common representation space in lifelong learning

P Bhat, B Zonooz, E Arani - arXiv preprint arXiv:2302.11346, 2023 - arxiv.org
Intelligent systems deployed in the real world suffer from catastrophic forgetting when
exposed to a sequence of tasks. Humans, on the other hand, acquire, consolidate, and …