The dormant neuron phenomenon in deep reinforcement learning
In this work we identify the dormant neuron phenomenon in deep reinforcement learning,
where an agent's network suffers from an increasing number of inactive neurons, thereby …
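A minimal sketch of how such dormancy can be measured: here a neuron is treated as dormant when its batch-averaged activation, normalized by the layer's mean activation, falls below a small threshold. The threshold value and the normalization scheme are illustrative assumptions, not necessarily the paper's exact definition.

```python
# Sketch (assumed definition): flag "dormant" neurons in a ReLU layer as those
# whose mean activation over a batch, normalized by the layer-wide average
# activation, is at most a small threshold tau.
import torch
import torch.nn as nn

def dormant_mask(layer: nn.Linear, inputs: torch.Tensor, tau: float = 0.025) -> torch.Tensor:
    with torch.no_grad():
        acts = torch.relu(layer(inputs))        # (batch, n_neurons)
        score = acts.mean(dim=0)                # per-neuron mean activation
        score = score / (score.mean() + 1e-8)   # normalize by the layer average
    return score <= tau                         # True -> dormant neuron

layer = nn.Linear(64, 128)
x = torch.randn(256, 64)
print("dormant neurons:", int(dormant_mask(layer, x).sum()))
```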
Improving language plasticity via pretraining with active forgetting
Y Chen, K Marchisio, R Raileanu… - Advances in …, 2023 - proceedings.neurips.cc
Pretrained language models (PLMs) are today the primary model for natural language
processing. Despite their impressive downstream performance, it can be difficult to apply …
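The snippet is truncated, but the mechanism named in the title, pretraining with active forgetting, can be sketched as periodically re-initializing the token-embedding layer while the rest of the model keeps training. The reset period, toy data, and tiny architecture below are illustrative assumptions, not the paper's code.

```python
# Sketch (assumed mechanism): every `reset_every` updates, re-initialize the
# embedding layer so the transformer body learns representations that do not
# depend on any particular embedding initialization.
import torch
import torch.nn as nn

vocab_size, d_model, reset_every = 1000, 64, 100
embedding = nn.Embedding(vocab_size, d_model)
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
head = nn.Linear(d_model, vocab_size)
params = list(embedding.parameters()) + list(body.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(1, 501):
    tokens = torch.randint(0, vocab_size, (8, 16))   # toy batch of token ids
    logits = head(body(embedding(tokens)))
    loss = nn.functional.cross_entropy(logits.view(-1, vocab_size), tokens.view(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    if step % reset_every == 0:
        embedding.reset_parameters()                 # "forget" the embeddings
```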
Fortuitous forgetting in connectionist networks
Forgetting is often seen as an unwanted characteristic in both human and machine learning.
However, we propose that forgetting can in fact be favorable to learning. We introduce "…
Diagnosing and re-learning for balanced multimodal learning
To overcome the imbalanced multimodal learning problem, where models prefer the training
of specific modalities, existing methods propose to control the training of uni-modal …
RepAn: Enhanced Annealing through Re-parameterization
The simulated annealing algorithm aims to improve model convergence through multiple
restarts of training. However, existing annealing algorithms overlook the correlation between …
When Does Re-initialization Work?
Re-initializing a neural network during training has been observed to improve generalization
in recent works. Yet it is neither widely adopted in deep learning practice nor is it often used …
ReFine: Re-randomization before Fine-tuning for Cross-domain Few-shot Learning
Cross-domain few-shot learning (CD-FSL), where there are few target samples under
extreme differences between source and target domains, has recently attracted huge …
Deepfakes audio detection leveraging audio spectrogram and convolutional neural networks
The proliferation of algorithms and commercial tools for the creation of synthetic audio has
resulted in a significant increase in the amount of inaccurate information, particularly on …
Towards cross domain generalization of Hamiltonian representation via meta learning
Recent advances in deep learning for physics have focused on discovering shared
representations of target systems by incorporating physics priors or inductive biases into …
Reset it and forget it: Relearning last-layer weights improves continual and transfer learning
This work identifies a simple pre-training mechanism that leads to representations exhibiting
better continual and transfer learning. This mechanism—the repeated resetting of weights in …
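The snippet cuts off, but the named mechanism, repeatedly resetting last-layer weights during pre-training, can be sketched as follows. The reset period, toy data, and architecture are illustrative assumptions rather than the paper's exact recipe.

```python
# Sketch (assumed mechanism): during pre-training, periodically re-initialize the
# final classification layer while earlier layers keep their weights, so the
# backbone learns features that a freshly initialized head can quickly relearn.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU())
head = nn.Linear(256, 10)
opt = torch.optim.SGD(list(backbone.parameters()) + list(head.parameters()), lr=0.1)

reset_every = 50
for step in range(1, 301):
    x = torch.randn(32, 1, 28, 28)                  # toy images
    y = torch.randint(0, 10, (32,))                 # toy labels
    loss = nn.functional.cross_entropy(head(backbone(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % reset_every == 0:
        head.reset_parameters()                     # reset and relearn the last layer
```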