Loss Curve Approximations for Fast Neural Architecture Ranking & Training Elasticity Estimation

D Zhao, NC Frey, V Gadepally… - 2022 IEEE International Parallel and Distributed Processing …, 2022 - ieeexplore.ieee.org
Two key questions arise around the optimization of any deep learning task. First, when should we stop training, or, alternatively, how long should we train before the gains no longer justify the continued cost (i.e., early or optimal stopping)? Second, what is the "right" or best model: which training settings, hyperparameters, and model architecture maximize performance on the task at hand (i.e., architecture search)? Though essential, these questions are arguably also the most expensive and least clearly defined parts of deep learning experimentation. Moreover, the exhaustive searches they entail require large computational budgets that can carry significant energy expenditure and a large environmental footprint. In this paper, we introduce a new method, the Loss Curve Gradient Approximation (LCGA), that ranks model performance with minimal training. Using a wide variety of popular deep vision models, we test its predictive power and performance across different neural architectures and training settings. For a comparative analysis, we benchmark LCGA against an existing technique used in architecture search and performance ranking, Training Speed Estimation (TSE), and show that LCGA can significantly outperform TSE while retaining the same advantages in ease, speed, and efficiency. Lastly, we describe potential applications of LCGA beyond its primary use: (1) combining collected experimental data with LCGA to develop train-less NAS, and (2) a framework that more rigorously guides early stopping in training by borrowing the concept of demand elasticity from economics.
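The abstract only names the two ranking techniques, so the following is a minimal illustrative sketch rather than the paper's actual formulation: a TSE-style score sums training losses over recent epochs, while an LCGA-style score ranks architectures by the average slope (gradient) of their early loss curves. All function names, loss values, and architecture names here are hypothetical.

```python
# Illustrative sketch only: neither function is the paper's exact method.

def tse_score(losses, window=3):
    """TSE-style score: sum of training losses over the last `window`
    recorded epochs (lower suggests a faster learner)."""
    return sum(losses[-window:])

def lcga_score(losses):
    """LCGA-style score (assumption): average per-epoch change of the
    loss curve; a more negative slope suggests a better-ranked model."""
    diffs = [b - a for a, b in zip(losses, losses[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical partial loss curves for three candidate architectures
curves = {
    "net_a": [2.3, 1.8, 1.5, 1.3],
    "net_b": [2.3, 2.1, 2.0, 1.9],
    "net_c": [2.3, 1.6, 1.2, 1.0],
}

# Rank candidates: steepest average descent first
ranking = sorted(curves, key=lambda k: lcga_score(curves[k]))
print(ranking)  # ['net_c', 'net_a', 'net_b']
```

Both scores need only a few epochs of training per candidate, which is what makes this family of estimators cheap compared with training every architecture to convergence.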