Communication Efficient Distributed Training with Distributed Lion
The Lion optimizer has been a promising competitor to AdamW for training large AI
models, with advantages on memory, computation, and sample efficiency. In this paper, we …