The dormant neuron phenomenon in deep reinforcement learning
In this work we identify the dormant neuron phenomenon in deep reinforcement learning,
where an agent's network suffers from an increasing number of inactive neurons, thereby …
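A minimal sketch of how such dormancy can be measured: here a neuron is treated as dormant when its batch-averaged activation, normalized by the layer's mean activation, falls below a small threshold. The threshold value and the normalization scheme are illustrative assumptions, not necessarily the paper's exact definition.

```python
# Sketch (assumed definition): flag "dormant" neurons in a ReLU layer as those
# whose mean activation over a batch, normalized by the layer-wide average
# activation, is at most a small threshold tau.
import torch
import torch.nn as nn

def dormant_mask(layer: nn.Linear, inputs: torch.Tensor, tau: float = 0.025) -> torch.Tensor:
    with torch.no_grad():
        acts = torch.relu(layer(inputs))        # (batch, n_neurons)
        score = acts.mean(dim=0)                # per-neuron mean activation
        score = score / (score.mean() + 1e-8)   # normalize by the layer average
    return score <= tau                         # True -> dormant neuron

layer = nn.Linear(64, 128)
x = torch.randn(256, 64)
print("dormant neurons:", int(dormant_mask(layer, x).sum()))
```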
Improving language plasticity via pretraining with active forgetting
Y Chen, K Marchisio, R Raileanu… - Advances in …, 2023 - proceedings.neurips.cc
Pretrained language models (PLMs) are today the primary model for natural language
processing. Despite their impressive downstream performance, it can be difficult to apply …
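The snippet is truncated, but the mechanism named in the title, pretraining with active forgetting, can be sketched as periodically re-initializing the token-embedding layer while the rest of the model keeps training. The reset period, toy data, and tiny architecture below are illustrative assumptions, not the paper's code.

```python
# Sketch (assumed mechanism): every `reset_every` updates, re-initialize the
# embedding layer so the transformer body learns representations that do not
# depend on any particular embedding initialization.
import torch
import torch.nn as nn

vocab_size, d_model, reset_every = 1000, 64, 100
embedding = nn.Embedding(vocab_size, d_model)
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
head = nn.Linear(d_model, vocab_size)
params = list(embedding.parameters()) + list(body.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(1, 501):
    tokens = torch.randint(0, vocab_size, (8, 16))   # toy batch of token ids
    logits = head(body(embedding(tokens)))
    loss = nn.functional.cross_entropy(logits.view(-1, vocab_size), tokens.view(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    if step % reset_every == 0:
        embedding.reset_parameters()                 # "forget" the embeddings
```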
Fortuitous forgetting in connectionist networks
Forgetting is often seen as an unwanted characteristic in both human and machine learning.
However, we propose that forgetting can in fact be favorable to learning. We introduce "…
Diagnosing and re-learning for balanced multimodal learning
To overcome the imbalanced multimodal learning problem, where models prefer the training
of specific modalities, existing methods propose to control the training of uni-modal …
RepAn: Enhanced Annealing through Re-parameterization
The simulated annealing algorithm aims to improve model convergence through multiple
restarts of training. However, existing annealing algorithms overlook the correlation between …
When Does Re-initialization Work?
Re-initializing a neural network during training has been observed to improve generalization
in recent works. Yet it is neither widely adopted in deep learning practice nor is it often used …
ReFine: Re-randomization before Fine-tuning for Cross-domain Few-shot Learning
Cross-domain few-shot learning (CD-FSL), where there are few target samples under
extreme differences between source and target domains, has recently attracted huge …
Deepfakes audio detection leveraging audio spectrogram and convolutional neural networks
The proliferation of algorithms and commercial tools for the creation of synthetic audio has
resulted in a significant increase in the amount of inaccurate information, particularly on …
Towards cross domain generalization of Hamiltonian representation via meta learning
Recent advances in deep learning for physics have focused on discovering shared
representations of target systems by incorporating physics priors or inductive biases into …
Reset it and forget it: Relearning last-layer weights improves continual and transfer learning
This work identifies a simple pre-training mechanism that leads to representations exhibiting
better continual and transfer learning. This mechanism—the repeated resetting of weights in …
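The snippet cuts off, but the named mechanism, repeatedly resetting last-layer weights during pre-training, can be sketched as follows. The reset period, toy data, and architecture are illustrative assumptions rather than the paper's exact recipe.

```python
# Sketch (assumed mechanism): during pre-training, periodically re-initialize the
# final classification layer while earlier layers keep their weights, so the
# backbone learns features that a freshly initialized head can quickly relearn.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU())
head = nn.Linear(256, 10)
opt = torch.optim.SGD(list(backbone.parameters()) + list(head.parameters()), lr=0.1)

reset_every = 50
for step in range(1, 301):
    x = torch.randn(32, 1, 28, 28)                  # toy images
    y = torch.randint(0, 10, (32,))                 # toy labels
    loss = nn.functional.cross_entropy(head(backbone(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % reset_every == 0:
        head.reset_parameters()                     # reset and relearn the last layer
```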