Subspace distillation for continual learning
Neural Networks, 2023 · Elsevier
Abstract
An ultimate objective in continual learning is to preserve knowledge learned in preceding tasks while learning new tasks. To mitigate forgetting of prior knowledge, we propose a novel knowledge distillation technique that takes into account the manifold structure of the latent/output space of a neural network when learning novel tasks. To achieve this, we propose to approximate the data manifold up to first order, hence benefiting from linear subspaces to model the structure and maintain the knowledge of a neural network while learning novel concepts. We demonstrate that modeling with subspaces provides several intriguing properties, including robustness to noise, and is therefore effective at mitigating catastrophic forgetting in continual learning. We also discuss and show how our proposed method can be adapted to address both classification and segmentation problems. Empirically, we observe that our proposed method outperforms various continual learning methods on several challenging datasets, including Pascal VOC and Tiny-ImageNet. Furthermore, we show how the proposed method can be seamlessly combined with existing learning approaches to improve their performance. The code for this article will be available at https://github.com/csiro-robotics/SDCL.
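The abstract only sketches the mechanism, so the following is a minimal, hypothetical PyTorch illustration of first-order (linear-subspace) distillation; it is not the authors' implementation (see the repository above for that). The assumptions are: the subspace for a batch of features is taken as the top-k right singular vectors of the centered feature matrix, and two subspaces are compared through their projection matrices, which makes the loss invariant to the choice of basis within each subspace. All function names and the choice of k are illustrative.

```python
# Hypothetical sketch of subspace distillation; not the paper's actual code.
import torch


def batch_subspace(features: torch.Tensor, k: int) -> torch.Tensor:
    """Orthonormal basis (d x k) for the top-k directions of an (n x d)
    feature batch: a first-order, linear approximation of the local data
    manifold around this batch. Assumes n >= k and d >= k."""
    X = features - features.mean(dim=0, keepdim=True)  # center the batch
    # The top-k right singular vectors span the best rank-k linear fit.
    _, _, Vh = torch.linalg.svd(X, full_matrices=False)
    return Vh[:k].T


def subspace_distillation_loss(feat_new: torch.Tensor,
                               feat_old: torch.Tensor,
                               k: int = 8) -> torch.Tensor:
    """Penalize drift between the subspace spanned by the current model's
    features and the one spanned by the frozen previous model's features.
    Comparing projection matrices P = U U^T sees only the subspaces, not
    any particular set of axes."""
    U_new = batch_subspace(feat_new, k)
    U_old = batch_subspace(feat_old.detach(), k)  # old model gives fixed targets
    P_new = U_new @ U_new.T
    P_old = U_old @ U_old.T
    return (P_new - P_old).pow(2).sum()  # squared Frobenius distance
```

In a continual-learning loop, such a term would be added to the task loss when training on each new task, with the same batch passed through both the trainable current network and a frozen copy of the previous-task network, e.g. `loss = task_loss + lam * subspace_distillation_loss(f_new, f_old)`.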