Neural architecture search for parameter-efficient fine-tuning of large pre-trained language models
Parameter-efficient tuning (PET) methods fit pre-trained language models (PLMs) to
downstream tasks by either computing a small compressed update for a subset of model
parameters, or appending and fine-tuning a small number of new model parameters to the
pre-trained network. Hand-designed PET architectures from the literature perform well in
practice, but have the potential to be improved via automated neural architecture search
(NAS). We propose an efficient NAS method for learning PET architectures via structured …
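The two PET families named in this abstract, a compressed update to a subset of existing parameters and a small set of appended new parameters, correspond in spirit to LoRA-style low-rank updates and bottleneck adapters. The sketch below is a minimal PyTorch illustration of both under assumed hyperparameters (rank r, bottleneck width) and class names chosen for this example; it is not the searched architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update
    (the 'compressed update for a subset of parameters' family)."""
    def __init__(self, base: nn.Linear, r: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep the pre-trained weights fixed
        self.A = nn.Parameter(torch.zeros(r, base.in_features))
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        nn.init.normal_(self.A, std=0.02)    # B stays zero, so the update starts as a no-op

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T


class Adapter(nn.Module):
    """Small bottleneck module appended after a frozen sub-layer
    (the 'append and fine-tune new parameters' family)."""
    def __init__(self, d_model: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, hidden):
        # residual connection keeps the pre-trained representation intact
        return hidden + self.up(self.act(self.down(hidden)))
```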
Parameter-efficient fine-tuning of large-scale pre-trained language models
With the prevalence of pre-trained language models (PLMs) and the pre-training–fine-tuning
paradigm, it has been consistently shown that larger models tend to yield better
performance. However, as PLMs scale up, fine-tuning and storing all the parameters is
prohibitively costly and eventually becomes practically infeasible. This necessitates a new
branch of research focusing on the parameter-efficient adaptation of PLMs, which optimizes
a small portion of the model parameters while keeping the rest fixed, drastically cutting down …
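The "optimize a small portion while keeping the rest fixed" recipe amounts to freezing every pre-trained weight and leaving only the added parameters trainable. A minimal sketch follows, assuming the tunable modules can be recognized by name prefixes such as "adapter" or "lora"; that naming convention is an illustrative assumption, not a requirement of any particular library.

```python
import torch.nn as nn

def make_parameter_efficient(model: nn.Module, tunable_prefixes=("adapter", "lora")):
    """Freeze all pre-trained weights; keep only newly added modules trainable.
    The prefix-based selection is an assumed convention for this example."""
    for name, p in model.named_parameters():
        p.requires_grad = any(
            name.startswith(pfx) or f".{pfx}" in name for pfx in tunable_prefixes
        )
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {trainable:,} / {total:,} ({100 * trainable / total:.2f}%)")
    return model
```

With typical adapter or low-rank settings, the printed fraction is usually well under 1% of the full model, which is what makes storing a separate copy per downstream task affordable.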