The dormant neuron phenomenon in deep reinforcement learning

G Sokar, R Agarwal, PS Castro… - … Conference on Machine …, 2023 - proceedings.mlr.press
In this work we identify the dormant neuron phenomenon in deep reinforcement learning,
where an agent's network suffers from an increasing number of inactive neurons, thereby …

Improving language plasticity via pretraining with active forgetting

Y Chen, K Marchisio, R Raileanu… - Advances in …, 2023 - proceedings.neurips.cc
Pretrained language models (PLMs) are today the primary model for natural language
processing. Despite their impressive downstream performance, it can be difficult to apply …

Fortuitous forgetting in connectionist networks

H Zhou, A Vani, H Larochelle, A Courville - arXiv preprint arXiv …, 2022 - arxiv.org
Forgetting is often seen as an unwanted characteristic in both human and machine learning.
However, we propose that forgetting can in fact be favorable to learning. We introduce …

Diagnosing and re-learning for balanced multimodal learning

Y Wei, S Li, R Feng, D Hu - European Conference on Computer Vision, 2025 - Springer
To overcome the imbalanced multimodal learning problem, where models prefer the training
of specific modalities, existing methods propose to control the training of uni-modal …

RepAn: Enhanced Annealing through Re-parameterization

X Fei, X Zheng, Y Wang, F Chao… - Proceedings of the …, 2024 - openaccess.thecvf.com
The simulated annealing algorithm aims to improve model convergence through multiple
restarts of training. However, existing annealing algorithms overlook the correlation between …

When Does Re-initialization Work?

S Zaidi, T Berariu, H Kim, J Bornschein… - Proceedings …, 2023 - proceedings.mlr.press
Re-initializing a neural network during training has been observed to improve generalization
in recent works. Yet it is neither widely adopted in deep learning practice nor is it often used …

ReFine: Re-randomization before Fine-tuning for Cross-domain Few-shot Learning

J Oh, S Kim, N Ho, JH Kim, H Song… - Proceedings of the 31st …, 2022 - dl.acm.org
Cross-domain few-shot learning (CD-FSL), where there are few target samples under
extreme differences between source and target domains, has recently attracted huge …

Deepfakes audio detection leveraging audio spectrogram and convolutional neural networks

TM Wani, I Amerini - International Conference on Image Analysis and …, 2023 - Springer
The proliferation of algorithms and commercial tools for the creation of synthetic audio has
resulted in a significant increase in the amount of inaccurate information, particularly on …

Towards cross domain generalization of Hamiltonian representation via meta learning

Y Song, H Jeong - ICLR 2024, The Twelfth International …, 2024 - koasas.kaist.ac.kr
Recent advances in deep learning for physics have focused on discovering shared
representations of target systems by incorporating physics priors or inductive biases into …

Reset it and forget it: Relearning last-layer weights improves continual and transfer learning

L Frati, N Traft, J Clune, N Cheney - ECAI 2024, 2024 - ebooks.iospress.nl
This work identifies a simple pre-training mechanism that leads to representations exhibiting
better continual and transfer learning. This mechanism—the repeated resetting of weights in …