signSGD: Compressed optimisation for non-convex problems

Authors

Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Animashree Anandkumar

Publication date

2018/7/3

Conference

International Conference on Machine Learning

Pages

560-569

Publisher

PMLR

Description

Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. signSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative geometry of gradients, noise and curvature informs whether signSGD or SGD is theoretically better suited to a particular problem. On the practical side we find that the momentum counterpart of signSGD is able to match the accuracy and convergence speed of Adam on deep Imagenet models. We extend our theory to the distributed setting, where the parameter server uses majority vote to aggregate gradient signs from each worker enabling 1-bit compression of worker-server communication in both directions. Using a theorem by Gauss we prove that majority vote can achieve the same reduction in variance as full precision distributed SGD. Thus, there is great promise for sign-based optimisation schemes to achieve fast communication and fast convergence. Code to reproduce experiments is to be found at https://github.com/jxbz/signSGD.
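
The two mechanisms named in the abstract, the sign-only update and the majority vote at the parameter server, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (see their repository for that). The function names `sign`, `signsgd_step`, `majority_vote` and `distributed_step` are hypothetical. The handling of zeros and tied votes (mapped to 0, so no update on that coordinate) is an assumption made for illustration, and the momentum variant is omitted.

```python
from typing import List, Sequence


def sign(x: float) -> int:
    """Return +1, -1 or 0 (assumption: an exact zero maps to 0)."""
    return (x > 0) - (x < 0)


def signsgd_step(params: Sequence[float], grad: Sequence[float], lr: float) -> List[float]:
    """Single-worker update: step each coordinate by lr against the gradient's sign."""
    return [p - lr * sign(g) for p, g in zip(params, grad)]


def majority_vote(worker_grads: Sequence[Sequence[float]]) -> List[int]:
    """Server-side aggregation: each worker contributes only sign(g);
    the server returns the sign of the per-coordinate sum of votes."""
    dim = len(worker_grads[0])
    votes = [sum(sign(g[i]) for g in worker_grads) for i in range(dim)]
    return [sign(v) for v in votes]  # assumption: a tied vote yields 0


def distributed_step(params: Sequence[float],
                     worker_grads: Sequence[Sequence[float]],
                     lr: float) -> List[float]:
    """One distributed step: workers send signs up, the server sends the agreed sign down."""
    agreed = majority_vote(worker_grads)
    return [p - lr * s for p, s in zip(params, agreed)]


if __name__ == "__main__":
    # Three hypothetical workers with noisy gradients for a 2-parameter model.
    grads = [[1.0, -2.0], [0.5, 3.0], [-4.0, -0.1]]
    print(majority_vote(grads))
    print(distributed_step([0.0, 0.0], grads, lr=0.1))
```

Only the sign of each coordinate crosses the network in either direction, which is what makes 1-bit communication per coordinate possible. The magnitude of the step is set entirely by the learning rate.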

Total citations

Cited by 1171

Citations per year: 2018: 11, 2019: 64, 2020: 169, 2021: 229, 2022: 242, 2023: 295, 2024: 159

Scholar articles

signSGD: Compressed optimisation for non-convex problems

J Bernstein, YX Wang, K Azizzadenesheli… - International Conference on Machine Learning, 2018

Cited by 1044 (12 versions)

signSGD with majority vote is communication efficient and fault tolerant*

J Bernstein, J Zhao, K Azizzadenesheli, A Anandkumar - arXiv preprint arXiv:1810.05291, 2018

Cited by 199 (7 versions)

Convergence rate of sign stochastic gradient descent for non-convex functions*

J Bernstein, K Azizzadenesheli, YX Wang… - 2018

J Bernstein, YX Wang, K Azizzadenesheli… - arXiv preprint arXiv:1802.04434, 2018